OroPlatform Forums

Covering OroPlatform topics, including community updates and company announcements.

Forums Forums OroPlatform Reports generating .csv with duplicated data

This topic contains 7 replies, has 4 voices, and was last updated by  hurricane 8 years, 2 months ago.

Starting from March 1, 2020 the forum has been switched to the read-only mode. Please head to StackOverflow for support.

  • Creator
    Topic
  • #36852

    Rodolfo
    Participant

    Hello,

    In Oro-grid I don’t have any duplicated data when generating some custom reports. But when generating .csv files, I got some duplicated entities. Anyone noticed the same problem?

    I’m generating .csv files with > 20k rows.

    Thanks!

Viewing 7 replies - 1 through 7 (of 7 total)
  • Author
    Replies
  • #36853

    Dima Soroka
    Keymaster

    Hi @Rodolfo

    Could you please describe your report structure so we’ll validate the issue?

    Thanks

    #36854

    Rodolfo
    Participant

    Hi @Soroka,

    I’m getting duplicated data in all reports with a considerable number of rows. (10k, 30k, 100k rows) This is an example based on Order Entity from Magento Channel:

    Other problem is depending of the size of the exported .csv file, my nginx + php-fpm are going down. (I believe the code are creating multiple connections with the server)

    Thanks for all effort.
    Rodolfo

    #36855

    Jaimy
    Participant

    Hi @rbandeira && @Soroka,

    We’ve experienced the same issue with a Magento channel, but we think we have figured it out what the cause is and how to over come it.

    “The problem”
    Magento orders have two addresses as you might know, the shipping- and billing address. So whenever you are using fields from Order -> Addresses -> “<FIELD>” it is using a LEFT JOIN to get the data. This LEFT JOIN will get you “duplicate” rows, which aren’t actually duplicate since you didn’t know this would happen. To verify my statement above, you can create a custom report with the following fields (use the Order entity):

    • Order / Order #
    • Order / Order items > Sku
    • Order / Addresses > Country

    I’ve added a few field conditions, which are a specific SKU and a time period (Order created at) because I’ve to much data and it’s faster :).
    After you’ve created the report you’ll see a nice grid and see no duplicate rows..But…
    you could run the following query in phpMyAdmin or something similar to verify you’re query:SELECT
    o0_.increment_id AS increment_id0,
    o1_.sku AS sku1,
    o2_.name AS name2,
    o0_.id AS id3,
    o3_.label AS label4
    FROM
    orocrm_magento_order o0_
    LEFT JOIN orocrm_magento_order_items o1_ ON o0_.id = o1_.order_id
    LEFT JOIN orocrm_magento_order_address o4_ ON o0_.id = o4_.owner_id
    LEFT JOIN oro_dictionary_country o2_ ON o4_.country_code = o2_.iso2_code
    LEFT JOIN oro_workflow_step o3_ ON o0_.workflow_step_id = o3_.id
    WHERE
    o1_.sku = 1000
    AND (
    o0_.created_at >= '2015-01-31 23:00:00'
    AND o0_.created_at <= '2015-02-23 23:00:00'
    )
    AND o0_.organization_id = 1
    ORDER BY
    increment_id0 ASC
    LIMIT
    25 OFFSET 0

    This is the query the report will run :). If you don’t have the opportunity to get the query, you should export the grid via “Export Grid” button and check your CSV file. Again you will see duplicate rows.

    So why do you not see this in the Datagrid? We’ll correct me if I’m wrong @Soroka, but if I’ve read it correctly the default value for the datagrids, at least in the yml files, the query is executed by “SELECT DISTINCT” by default. This explains why you wouldn’t see the duplicate rows in the datagrid. Again I remember reading this somewhere along some bundle, but please correct me if I’m wrong.

    So their you have your problem, but how to over come this?

    Well the “Solution” is:
    Add an extra field condition to your report. Which field condition? Well add a Order Address Type field condition; this can be found in Order / Order Addresses / Types > Label this should be set to billing or shipping, your choice.

    Or if you have a Order Increment Id in your Column list, you could group by Order Increment Id (this helps only if you have the Order increment id in your column list). I prefer the first solution since you are able to apply this technique/thingy to other reports which contain Order Addresses fields.

    Hope this helps other people and @Soroka saves a lot of research time to get to the bottom of this ;).

    Kind Regards,

    Jaimy Casteleijn

    #36856

    Dima Soroka
    Keymaster

    Thanks @Jaimy, this is the case. We are adding SQL output to the report page (this should simplify debugging) and will fix grid query.

    #36857

    Jaimy
    Participant

    Hi @Soroka,

    Not a problem! Is there a reference for default values on grids etc so we can educate our clients on what is happening since if we (as developers) have the greatest trouble getting to the bottom of this, the clients a.k.a the end user will certainly have trouble figuring this out. They usually do not have the knowledge nor the ability to fully debug such problems.

    It would be nice to use our knowledge of what is “potentially” happening, to help and create clear directions on what clients should look after when creating such reports.

    Kind Regards,

    Jaimy Casteleijn

    #36858

    Rodolfo
    Participant

    Hello @soroka and @jaimy

    Thank you so much for all the effort. I’m pretty sure that all these challenges we’re facing will help to improve the OroCRM.

    Best Regards,
    Rodolfo

    #36859

    hurricane
    Participant

    Hi @rbandeira and @jaimy,

    As for now issue with duplicating entities is not reproducing.
    Also we have added SQL output on the report view page so you can easily debug report query (by default its enabled for role administrator).

    Best regards,
    Denis Voronin

Viewing 7 replies - 1 through 7 (of 7 total)

The forum ‘OroPlatform’ is closed to new topics and replies.

Back to top