Forums › Forums › OroPlatform › Reports generating .csv with duplicated data
This topic contains 7 replies, has 4 voices, and was last updated by hurricane 8 years, 2 months ago.
Starting from March 1, 2020 the forum has been switched to the read-only mode. Please head to StackOverflow for support.
- CreatorTopic
- January 22, 2015 at 11:41 am #36852
Hello,
In Oro-grid I don’t have any duplicated data when generating some custom reports. But when generating .csv files, I got some duplicated entities. Anyone noticed the same problem?
I’m generating .csv files with > 20k rows.
Thanks!
- CreatorTopic
- AuthorReplies
- February 10, 2015 at 10:29 am #36853
Hi @Rodolfo
Could you please describe your report structure so we’ll validate the issue?
Thanks
February 10, 2015 at 10:46 am #36854Hi @Soroka,
I’m getting duplicated data in all reports with a considerable number of rows. (10k, 30k, 100k rows) This is an example based on Order Entity from Magento Channel:
Other problem is depending of the size of the exported .csv file, my nginx + php-fpm are going down. (I believe the code are creating multiple connections with the server)
Thanks for all effort.
RodolfoFebruary 24, 2015 at 11:34 am #36855Hi @rbandeira && @Soroka,
We’ve experienced the same issue with a Magento channel, but we think we have figured it out what the cause is and how to over come it.
“The problem”
Magento orders have two addresses as you might know, the shipping- and billing address. So whenever you are using fields from Order -> Addresses -> “<FIELD>” it is using a LEFT JOIN to get the data. This LEFT JOIN will get you “duplicate” rows, which aren’t actually duplicate since you didn’t know this would happen. To verify my statement above, you can create a custom report with the following fields (use the Order entity):- Order / Order #
- Order / Order items > Sku
- Order / Addresses > Country
I’ve added a few field conditions, which are a specific SKU and a time period (Order created at) because I’ve to much data and it’s faster :).
After you’ve created the report you’ll see a nice grid and see no duplicate rows..But…
you could run the following query in phpMyAdmin or something similar to verify you’re query:SELECT
o0_.increment_id AS increment_id0,
o1_.sku AS sku1,
o2_.name AS name2,
o0_.id AS id3,
o3_.label AS label4
FROM
orocrm_magento_order o0_
LEFT JOIN orocrm_magento_order_items o1_ ON o0_.id = o1_.order_id
LEFT JOIN orocrm_magento_order_address o4_ ON o0_.id = o4_.owner_id
LEFT JOIN oro_dictionary_country o2_ ON o4_.country_code = o2_.iso2_code
LEFT JOIN oro_workflow_step o3_ ON o0_.workflow_step_id = o3_.id
WHERE
o1_.sku = 1000
AND (
o0_.created_at >= '2015-01-31 23:00:00'
AND o0_.created_at <= '2015-02-23 23:00:00'
)
AND o0_.organization_id = 1
ORDER BY
increment_id0 ASC
LIMIT
25 OFFSET 0This is the query the report will run :). If you don’t have the opportunity to get the query, you should export the grid via “Export Grid” button and check your CSV file. Again you will see duplicate rows.
So why do you not see this in the Datagrid? We’ll correct me if I’m wrong @Soroka, but if I’ve read it correctly the default value for the datagrids, at least in the yml files, the query is executed by “SELECT DISTINCT” by default. This explains why you wouldn’t see the duplicate rows in the datagrid. Again I remember reading this somewhere along some bundle, but please correct me if I’m wrong.
So their you have your problem, but how to over come this?
Well the “Solution” is:
Add an extra field condition to your report. Which field condition? Well add a Order Address Type field condition; this can be found in Order / Order Addresses / Types > Label this should be set to billing or shipping, your choice.Or if you have a Order Increment Id in your Column list, you could group by Order Increment Id (this helps only if you have the Order increment id in your column list). I prefer the first solution since you are able to apply this technique/thingy to other reports which contain Order Addresses fields.
Hope this helps other people and @Soroka saves a lot of research time to get to the bottom of this ;).
Kind Regards,
Jaimy Casteleijn
February 24, 2015 at 4:45 pm #36856Thanks @Jaimy, this is the case. We are adding SQL output to the report page (this should simplify debugging) and will fix grid query.
February 25, 2015 at 12:24 am #36857Hi @Soroka,
Not a problem! Is there a reference for default values on grids etc so we can educate our clients on what is happening since if we (as developers) have the greatest trouble getting to the bottom of this, the clients a.k.a the end user will certainly have trouble figuring this out. They usually do not have the knowledge nor the ability to fully debug such problems.
It would be nice to use our knowledge of what is “potentially” happening, to help and create clear directions on what clients should look after when creating such reports.
Kind Regards,
Jaimy Casteleijn
February 25, 2015 at 12:59 pm #36858Hello @soroka and @jaimy
Thank you so much for all the effort. I’m pretty sure that all these challenges we’re facing will help to improve the OroCRM.
Best Regards,
RodolfoFebruary 16, 2016 at 4:09 am #36859Hi @rbandeira and @jaimy,
As for now issue with duplicating entities is not reproducing.
Also we have added SQL output on the report view page so you can easily debug report query (by default its enabled for role administrator).Best regards,
Denis Voronin - AuthorReplies
The forum ‘OroPlatform’ is closed to new topics and replies.