[CAM-9562] External tasks are not unlocked in case of orphaned async fetch and lock requests Created: 29/Nov/18  Updated: 28/Feb/19  Resolved: 22/Jan/19

Status: Closed
Project: camunda BPM
Component/s: engine
Affects Version/s: 7.10.0, 7.9.6, 7.11.0
Fix Version/s: 7.11.0, 7.10.2, 7.9.9, 7.11.0-alpha1

Type: Bug Report Priority: L3 - Default
Reporter: Tassilo Weidner Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: SUPPORT
Remaining Estimate: 0 minutes
Time Spent: Not Specified
Original Estimate: 0 minutes

Issue Links:
Dependency

 Description   

Steps to reproduce

  1. A task executor sends an async 'fetch and lock' request to the REST API with an asyncResponseTimeout
  2. The REST API queues the request because no external tasks are available to fetch and lock; the connection is kept alive
  3. The task executor does not wait for the async response and cancels the connection
  4. An external task instance is started
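
Steps 1 to 3 can be approximated with a small client sketch. The request body follows the documented fields of POST /external-task/fetchAndLock; the host, port, the /engine-rest context path, the topic name, and both helper names are assumptions made for illustration.

```python
import http.client
import json


def build_fetch_and_lock_body(worker_id, topic_name, async_response_timeout_ms=30000,
                              lock_duration_ms=60000, max_tasks=1):
    """Build the JSON body of a long-polling fetch and lock request."""
    return json.dumps({
        "workerId": worker_id,
        "maxTasks": max_tasks,
        # Ask the server to hold the request open if no task is available yet.
        "asyncResponseTimeout": async_response_timeout_ms,
        "topics": [{"topicName": topic_name, "lockDuration": lock_duration_ms}],
    })


def send_and_abandon(host, port, body,
                     path="/engine-rest/external-task/fetchAndLock"):
    """Send the request, then close the socket without reading the response.

    The path is an assumption; adjust it to the deployment's context path.
    """
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/json"})
    finally:
        # Step 3: the client walks away while the server still holds the request.
        conn.close()
```

Calling send_and_abandon("localhost", 8080, build_fetch_and_lock_body("worker-1", "demo-topic")) against a running engine and then starting a process instance with a matching external task would correspond to step 4.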

Observed behavior
The external task instance is fetched and locked although no client is actively waiting for it.

Expected behavior
The fetched and locked external task is unlocked if the response cannot be sent back to the client because the client-side socket has already been cleaned up by the operating system.

Hints
The server can only unlock the fetched and locked external task if the client-side socket has been cleaned up by the kernel. If the client-side socket is still in state FIN_WAIT_2, the server completes the cancelled request successfully and the task stays locked. This behavior is operating-system specific and beyond the control of Camunda BPM.
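
The expected behavior and its limitation can be illustrated with a minimal model. This is not the engine's code: LockedTask, deliver_or_unlock, and the send_response callback are hypothetical names, and a failed socket write is represented by an OSError.

```python
class LockedTask:
    """Hypothetical stand-in for a fetched and locked external task."""

    def __init__(self, task_id, worker_id):
        self.task_id = task_id
        self.worker_id = worker_id  # None means the task is unlocked

    def unlock(self):
        self.worker_id = None


def deliver_or_unlock(tasks, send_response):
    """Try to send the locked tasks to the client; unlock them if the write fails."""
    try:
        send_response(tasks)
        return True
    except OSError:
        # The write only fails once the client-side socket is really gone.
        for task in tasks:
            task.unlock()
        return False
```

If the client-side socket is still in FIN_WAIT_2, the write does not raise, so this model returns True and the task stays locked, which matches the limitation described in the hint.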



 Comments   
Comment by Smirnov Roman [ 19/Dec/18 ]

Possible solution to avoid this: Add a configuration flag that indicates orphaned requests can be cancelled when they have the same worker id.

Comment by Thorben Lindhauer [ 14/Jan/19 ]

Implemented solution: The web.xml configuration flag fetch-and-lock-unique-worker-request, when enabled, ensures that at most one request per worker id is pending at any time. That way, an orphaned request is cleared as soon as the worker reconnects. This can be combined with manually unlocking tasks on reconnect to solve the overall problem of tasks being locked for a worker that never receives them.
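
The effect of the flag can be modeled with a small in-memory queue. This is only a sketch of the described behavior, not the engine's implementation; PendingRequest, FetchAndLockQueue, and the unique_worker_request parameter are hypothetical names.

```python
class PendingRequest:
    """Hypothetical queued fetch and lock request."""

    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.cancelled = False


class FetchAndLockQueue:
    """Model of a pending-request queue with an optional one-request-per-worker rule."""

    def __init__(self, unique_worker_request=False):
        self.unique_worker_request = unique_worker_request
        self.pending = []

    def add(self, worker_id):
        if self.unique_worker_request:
            # A reconnecting worker replaces its own earlier, possibly orphaned, request.
            for request in self.pending:
                if request.worker_id == worker_id:
                    request.cancelled = True
            self.pending = [r for r in self.pending if not r.cancelled]
        request = PendingRequest(worker_id)
        self.pending.append(request)
        return request
```

With the rule enabled, a second add("worker-1") cancels the first request for that worker and leaves requests from other workers untouched. With it disabled, both requests stay queued, so the orphaned one can still fetch and lock a task.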

Generated at Thu Aug 22 11:52:06 CEST 2019 using JIRA 6.4.6#64021-sha1:33e5b454af4594f54560ac233c30a6e00459507e.