camunda BPM / CAM-9612

BulkFetch complex/object-Variable values in context of historic detail data

    Details

      Description

      AT:

      • given:
        • I have more than 100 million variables in the Camunda engine
        • at least 10 % of the variables are complex variables (e.g. JSON or XML)
      • when:
        • I fetch 10 000 historic variable updates from the historic detail table via the Optimize REST API or the historic detail endpoint
      • then:
        • it takes no more than 2 seconds
      • such that:
        • even if the user has many complex variables, Optimize can import the data quickly

      Hints:
      When bulk-fetching historic variable updates from the historic detail table, the current implementation (e.g. HistoricDetailQueryImpl#executeList and OptimizeHistoricVariableUpdateQueryCmd#fetchVariableValues) calls getTypedValue sequentially for each variable entry, which issues one additional query per complex/object variable in AbstractSerializableValueSerializer#readValue to resolve the actual value.

      This does not scale for maxResults values on the order of several thousand, as used by Optimize when importing data, e.g.:
      maxResults -> response time
      100 -> 400ms
      200 -> 640ms
      500 -> 1s
      1000 -> 2s
      10_000 -> 20s

      We need an implementation of variable fetching that scales for this use case, e.g. one that fetches all byteArray entries in a single bulk query.
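
      The bulk approach suggested above can be sketched as follows. This is a minimal, hypothetical illustration (class and method names are invented; plain Java collections stand in for the ACT_GE_BYTEARRAY table and the IN-clause query): collect the byteArray ids of all complex variables first, then resolve them in one bulk lookup instead of one query per row.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch: resolve complex-variable payloads in one bulk query
// instead of one query per variable (the current per-row getTypedValue path).
public class BulkVariableFetchSketch {

    // Simplified stand-in for a historic variable update row; byteArrayId is
    // null for primitive variables and set for complex/object variables.
    record VariableUpdate(String id, String byteArrayId) {}

    // Stand-in for a single SELECT ... WHERE ID_ IN (...) against the
    // byte-array table; here the "table" is just an in-memory map.
    static Map<String, byte[]> fetchByteArraysBulk(Map<String, byte[]> store,
                                                   Collection<String> ids) {
        Map<String, byte[]> result = new HashMap<>();
        for (String id : ids) {
            byte[] bytes = store.get(id);
            if (bytes != null) result.put(id, bytes);
        }
        return result;
    }

    static Map<String, byte[]> resolveValues(List<VariableUpdate> updates,
                                             Map<String, byte[]> store) {
        // 1. collect the byte-array ids of all complex variables (skip primitives)
        Set<String> ids = updates.stream()
            .map(VariableUpdate::byteArrayId)
            .filter(Objects::nonNull)
            .collect(Collectors.toSet());
        // 2. one bulk query instead of ids.size() individual queries
        Map<String, byte[]> blobs = fetchByteArraysBulk(store, ids);
        // 3. attach each resolved payload to its variable update
        Map<String, byte[]> resolved = new HashMap<>();
        for (VariableUpdate u : updates) {
            if (u.byteArrayId() != null) resolved.put(u.id(), blobs.get(u.byteArrayId()));
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, byte[]> store = Map.of(
            "ba-1", "{\"a\":1}".getBytes(),
            "ba-2", "<x/>".getBytes());
        List<VariableUpdate> updates = List.of(
            new VariableUpdate("var-1", "ba-1"),
            new VariableUpdate("var-2", null),   // primitive variable, no blob
            new VariableUpdate("var-3", "ba-2"));
        Map<String, byte[]> resolved = resolveValues(updates, store);
        System.out.println(resolved.size());                   // 2
        System.out.println(new String(resolved.get("var-1"))); // {"a":1}
    }
}
```

      With a real database the bulk step would become a single IN-clause (or batched IN-clause, given parameter limits) query, turning N+1 round trips into a constant number per page of results.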

        Activity

        sebastian.bathke Sebastian Bathke created issue -
        johannes.heinemann Johannes Heinemann made changes -
        Field Original Value New Value
        Summary BulkFetch complex/object-Variable values in context of historic data BulkFetch complex/object-Variable values in context of historic detail data
        johannes.heinemann Johannes Heinemann made changes -
        Description
        johannes.heinemann Johannes Heinemann made changes -
        Description
        johannes.heinemann Johannes Heinemann made changes -
        Description
        felix.mueller Felix Müller made changes -
        Remote Link This issue links to "Page (camunda confluence)" [ 12472 ]
        roman.smirnov Smirnov Roman made changes -
        Fix Version/s 7.11.0 [ 15343 ]
        Fix Version/s 7.10.2 [ 15351 ]
        Fix Version/s 7.9.9 [ 15352 ]
        Fix Version/s 7.8.14 [ 15358 ]
        roman.smirnov Smirnov Roman made changes -
        Issue Type Feature Request [ 2 ] Task [ 3 ]
        roman.smirnov Smirnov Roman made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        roman.smirnov Smirnov Roman made changes -
        Status In Progress [ 3 ] Resolved [ 5 ]
        Original Estimate 0 minutes [ 0 ]
        Remaining Estimate 0 minutes [ 0 ]
        Assignee Tassilo Weidner [ tassilo.weidner ]
        Resolution Fixed [ 1 ]
        roman.smirnov Smirnov Roman made changes -
        Remote Link This issue links to "Page (camunda confluence)" [ 12472 ]
        tassilo.weidner Tassilo Weidner made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Assignee Tassilo Weidner [ tassilo.weidner ]
        thorben.lindhauer Thorben Lindhauer made changes -
        Fix Version/s 7.11.0-alpha1 [ 15370 ]
        thorben.lindhauer Thorben Lindhauer made changes -
        Workflow camunda BPM [ 54238 ] Backup_camunda BPM [ 64186 ]

          People

          • Assignee: Unassigned
          • Reporter: sebastian.bathke Sebastian Bathke
          • Votes: 0
          • Watchers: 2
