Hi,
I'm testing a new implementation of a GemFire region in my server system, and I found some issues in a specific scenario.
The idea is as follows: my server system has a region called transaction with the following config:
<!-- The attributes for transaction data region -->
<region-attributes id="TRANSACTION" disk-store-name="transaction-disk-store"
    disk-synchronous="true" scope="distributed-ack" data-policy="persistent-replicate"
    concurrency-level="32" statistics-enabled="true" enable-gateway="true"
    hub-id="GIRE-DATABASE-1">
    <subscription-attributes interest-policy="all" />
</region-attributes>
<!-- When a payment comes to the server, the gateway puts it in this region.
     This region immediately transfers it to the gateway so it can be asynchronously
     processed -->
<region name="transaction" refid="TRANSACTION">
</region>
I have two nodes (servers) in my system, and a transaction is inserted into this region when a client sends it through a gateway.
Then, through the same gateway, I perform a write-behind of these transactions to my Oracle DB, using a GatewayEventListener.
This is the config of the gateway:
<!-- This is the gateway we receive payments on. A port must be configured
     for the gateway. We are starting gateway hub in server initializer with custom
     port if need -->
<gateway-hub id="GIRE-DATABASE-1" port="11111"
    manual-start="true">
    <gateway id="GIRE-DATABASE-1">
        <gateway-listener>
            <class-name>com.gire.rp.server.infrastructure.listener.TransactionGatewayHubListener</class-name>
        </gateway-listener>
        <gateway-queue disk-store-name="transaction-queue-disk-store"
            maximum-queue-memory="256" batch-size="1100" batch-time-interval="10000"
            batch-conflation="false" enable-persistence="true" />
    </gateway>
</gateway-hub>
And this is the code where I start up the gateway hub in the server:
void setUpAndStartGatewayHub(String gatewayHubId, int port)
{
    GatewayHub gatewayHub = cache.getGatewayHub(gatewayHubId);
    if (gatewayHub == null)
    {
        throw new ServerInicializationException("No gateway hub found for id " + gatewayHubId);
    }
    gatewayHub.setPort(port);
    tryStartGatewayHub(gatewayHub);
}
void tryStartGatewayHub(GatewayHub gatewayHub)
{
    gatewayHub.setStartupPolicy(GatewayHub.DEFAULT_STARTUP_POLICY);
    try
    {
        gatewayHub.start();
    }
    catch (IOException e)
    {
        throw new ServerInicializationException("Failed to start gateway hub " + gatewayHub.getId() + " on port " + serverProperties.getTransactionGatewayHubPort(), e);
    }
}
The problem I found appears in the following scenario:
1) I started the two servers.
2) I started my client.
3) I sent 2500 transactions and, while they were being sent, I took down one server and then the other. With both servers down, transactions continued to be generated on the client.
4) I shut down the client.
5) I checked my logs and found that 1896 transactions were received in the batches of the GatewayEventListener and 44 were saved to Oracle. I checked Oracle and found those 44.
6) I started one server and saw in gfmonitor that the transaction region had 811 transactions. I checked the Oracle database and found only 642 transactions.
7) I started the other server, and the numbers never changed.
8) I started the client. I saw in gfmonitor that the transaction region had 2500 transactions, but the Oracle database had 2331.
This is my problem: in the logs I can see that there are events that never reach the GatewayEventListener. For this reason, my GemFire region has all the transactions but Oracle does not.
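To make the numbers from steps 6 and 8 easier to compare, here is a small sketch that only subtracts the database count from the region count. The class and method names are made up for illustration; the figures are the ones listed above.

```java
public class CountCheck {
    /** Number of entries present in the region but absent from the database. */
    static int missing(int inRegion, int inDatabase) {
        return inRegion - inDatabase;
    }

    public static void main(String[] args) {
        // Step 6: one server restarted (811 in the region, 642 in Oracle).
        System.out.println("step 6 gap: " + missing(811, 642));
        // Step 8: both servers and the client restarted (2500 in the region, 2331 in Oracle).
        System.out.println("step 8 gap: " + missing(2500, 2331));
    }
}
```

Both gaps come out to 169, so the number of transactions missing from Oracle does not change between step 6 and step 8.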
I found these lines in the log of the first server that was taken down:
[warning 2014/01/09 13:46:58.809 ART gem-r35-01.gire.com <main> tid=0x1] GatewayEventProcessor[gatewayId=GIRE-DATABASE-1;gatewayHubId=GIRE-DATABASE-1;diskStoreName=transaction-queue-disk-store;batchSize=1100;batchTimeInterval=10000;batchConflation=false;enablePersistence=true]:Dispatcher still alive even after join of 5 seconds.
[warning 2014/01/09 13:46:58.810 ART gem-r35-01.gire.com <main> tid=0x1] Destroying GatewayEventDispatcher with actively queued data.
This could explain the missing events, but how can I solve it?
Thanks,
Juan