Import Bug Solution & a Couple of Questions
Tagged: CSV import, custom field import, import, slow import
- This topic has 5 replies, 3 voices, and was last updated 1 year, 1 month ago by VR51.
October 31, 2021 at 3:47 pm · #71010 · VR51 (Participant)
I have suffered a lot of failed and partial imports from Google Sheets. These failures occur when the import is via the Gsheets URL.
The problems are that:
- Often the import times out.
- Sometimes WPGMaps is unable to detect the CSV header row in the imported CSV file.
- Sometimes the old data is not cleanly deleted from the map and I get duplicates.
The import usually contains around 480 rows of data. This data contains a mix of regular map fields, a category field column and several map custom fields (created natively with WPGMaps and not a custom fields plugin).
In an attempt to solve issue 1 I increased the script run timeouts for both our database server and our PHP server. I needed to set a timeout of 20 minutes, which is excessive, yet we still hit import timeouts. The timeout happens whether we import via a file upload or via URL.
The solution I arrived at is to split the import into 5 separate files with 100 rows of data plus the CSV header row. This solved issue number 1 but there is a cron timing problem here that I will come to in a moment.
This still leaves me with issue 2: the CSV header is not always detected when we schedule the import to happen via GSheet URLs. This causes all sorts of issues with the import. Sometimes the import restarts and the header row is detected, but then the old map data is not wiped (I have a question coming up for you about the need to wipe old data too).
To solve issue 2 I wrote a script to periodically download the CSV files from GSheets. WPGMaps is configured to import these files after they have been downloaded by my script. Problem solved. The header row is now detected.
I should not need to write a script to download files from a 3rd party service just so that WPGMaps can successfully import those files. WPGMaps should download files before it tries to import them. My script is available here on GitHub (modified from my original to make it generic; I was tired when I modified it and have not tested the modified version, but you will get the gist of it): https://gist.github.com/VR51/b6581a1e68c3bc32cf7d1c056d839c8c
Please consider using a similar download function in WPGMaps to improve the remote file (URL) import process and thereby reduce the number of failed imports caused by failure to detect the header row in a file.
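To make the idea concrete, here is a minimal sketch of the download-first step in plain PHP. The function names are my own invention, not part of WPGMaps, and a production version would also need to handle redirects, HTTP errors and retries:

```php
<?php
// Build the CSV export URL for a Google Sheet. Importing this URL directly
// can stall mid-request; downloading it to a local file first gives the
// importer a stable file whose header row can always be detected.
function gsheet_csv_url(string $sheet_id, int $gid = 0): string {
    return "https://docs.google.com/spreadsheets/d/{$sheet_id}/export?format=csv&gid={$gid}";
}

// Fetch a remote CSV to a local path and return that path, or null on
// failure (in which case the previously downloaded file is left in place).
function download_csv(string $url, string $dest): ?string {
    $data = @file_get_contents($url);
    if ($data === false || $data === '') {
        return null; // download failed; keep the old local copy
    }
    file_put_contents($dest, $data);
    return $dest;
}
```

WPGMaps would then be pointed at the local `$dest` file rather than the remote URL, which is essentially what my gist does.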
Is there a way to speed up the import of WPGMaps custom field data?
It appears to me that WPGMaps takes an extraordinarily long time to import data from files that contain columns for custom field data. When custom field data is not present in import files my map file import of ~480 rows finishes within 5 to 10 minutes. As soon as custom field data is present in the file the import takes 40 to 50 minutes to complete. There seems to be an issue with the first part of the import process that reads the file data. I assume this is the data-check process that occurs first as data is loaded into the database. The Google geolocation requests happen quite fast.
Data Update versus Data Wipe for Fresh Import
Is there a way to import map data that updates the old data, without needing to wipe the existing data first?
My data file contains an id column. Each row has a unique id. This id is associated with the map location. This id never changes. Can we match new data rows to old data rows by comparing id column numbers and then just update changed data?
Every time I import data into my map I have to wipe the old data, otherwise I get duplicates. What starts as 480 map locations suddenly becomes 960 map locations, then 1440, then 1920, etc…
Can you please add to the next release of WPGMaps Pro a method for comparing data, plus a button on the import page that says ‘Update’, so that we then have the options ‘Import’, ‘Update’ and ‘Schedule’.
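For what it’s worth, MariaDB already supports this pattern at the SQL level via INSERT … ON DUPLICATE KEY UPDATE. A minimal sketch follows; the table and column names are made up for illustration and are not WPGMaps’ real schema, and real WordPress code should use $wpdb->prepare() rather than hand-escaping:

```php
<?php
// Build an upsert for one CSV row, keyed on the unique id column.
// ON DUPLICATE KEY UPDATE requires a UNIQUE or PRIMARY KEY index on id:
// rows whose id already exists are updated in place, new ids are inserted,
// so no wipe-and-reimport is needed and no duplicates appear.
function build_upsert(string $table, array $row): string {
    $id    = (int) $row['id'];
    $title = addslashes($row['title']); // illustration only; use $wpdb->prepare() in WordPress
    $lat   = (float) $row['lat'];
    $lng   = (float) $row['lng'];
    return "INSERT INTO {$table} (id, title, lat, lng) "
         . "VALUES ({$id}, '{$title}', {$lat}, {$lng}) "
         . "ON DUPLICATE KEY UPDATE title = VALUES(title), "
         . "lat = VALUES(lat), lng = VALUES(lng)";
}
```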
Please Add a Start Time to the Scheduler
Please take a look at my fork of Import Users from CSV. You will see that around line 509 of the plugin script there is an input field that allows people to set a start time for automated import schedules. Please can you add a similar input field to WPGMaps so that imports can be more accurately timed. The code is here (line 509): https://github.com/VR51/import-users-from-csv/blob/master/import-users-from-csv.php
The problem I now have is that I have configured 5 import schedules. The first scheduled import deletes the old map data; the subsequent 4 scheduled imports are configured to only add extra data to the map. If these events fire at the wrong times I could end up with less data in the map than anticipated (if the first import happens out of sequence) or with duplicate data (if the first import fails to happen at all).
I’ve used a Cron Manager plugin to configure the scheduled import times, but adding a start time field to the map scheduler seems such an easy task that I feel it should be built into the import form.
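To illustrate how little is needed, WordPress’s own wp_schedule_event() already accepts an arbitrary first-run timestamp; all the form would have to do is compute it. A sketch (the hook name 'my_wpgmaps_import' is hypothetical, and this ignores DST shifts):

```php
<?php
// Compute the next occurrence of a local "HH:MM" start time relative to
// $now. If that time has already passed today, schedule it for tomorrow.
function next_run_at(string $hhmm, int $now): int {
    [$h, $m] = array_map('intval', explode(':', $hhmm));
    $candidate = mktime($h, $m, 0,
        (int) date('n', $now), (int) date('j', $now), (int) date('Y', $now));
    return ($candidate > $now) ? $candidate : $candidate + 86400;
}

// In a WordPress context this feeds wp_schedule_event() directly, e.g.:
//   wp_schedule_event(next_run_at('03:30', time()), 'daily', 'my_wpgmaps_import');
```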
To Summarise
1. Please improve the file import process so that data collected from 3rd party hosts is downloaded to a file before it is imported by WPGMaps. This prevents missing header row errors.
2. Please improve the import process so that custom field data imports quickly. This will reduce timeout errors.
3. Please make it possible to continue an import if it fails such that the import picks up from where it left off.
4. Please make it possible to ping scheduled imports so that a failed import can be restarted from where it left off.
5. Please add a ‘run now’ button to scheduled imports so they can be initiated without going through the edit import steps.
6. Please add an Update Data feature to the import process. This will allow imports to take place without existing map data being wiped first.
7. Please add a way to turn off backups. I think map data backups might be partially responsible for import timeouts.
8. Please add a way to delete import logs. I think long import logs might be the reason the Advanced settings page takes longer to load than it should.
My particular server runs Apache, Nginx and MariaDB, has 32 CPU cores and 64GB RAM. The server has been tuned for speed. I keep adjusting the database server settings to improve the WPGMaps import speed but nothing seems to make much difference.
I know I’ve written a lot above. I’ve been storing up my issues and suggestions ready to write this forum post while I work on other tasks.
I like the plugin but the import process is a bit choppy.
Thank you,
Lee

November 10, 2021 at 10:57 pm · #71157 · VR51 (Participant)

Do we have any feedback on the above yet? Any solutions to the slow import process?
November 15, 2021 at 11:34 pm · #71235 · brucekbf (Participant)

I’m also hoping to get some answers to this issue. We are running into incomplete imports regularly, which is made more frustrating by the fact that the “ID” field doesn’t seem to do anything. Instead of being able to attempt the import again and have it skip what it’s already uploaded, it duplicates what it’s already done and fails again.
It would be great if you could select whether to “update existing records”, “ignore existing and upload new records” or “overwrite all”. For instance, I have a database of locations that occasionally has updates. Ideally it would skip all data that has remained the same, append new data to existing records (unified by ID number), and add new markers where there are new IDs and data.
November 16, 2021 at 10:25 am · #71241 · Steven (Moderator)

Hi there,
Thank you so much for getting in touch, we really appreciate your time.
@VR51, I have taken note of your requests and will be adding these to our suggestions log for a future update of WP Google Maps.
As for import speed, this is unfortunately not determined by the plugin, as nothing additional is done during the import; instead it is determined by the server and the quantity of data. For large amounts of import data, we usually suggest users split the data into smaller batch files.
@brucekbf, @VR51, when importing data, presently all data needs to first be deleted and then replaced with the new data. This should theoretically be quicker, as it simply requires a delete query followed by an insert query, whereas if the plugin were to update in place, the queries would need to first go through the existing data, compare it, and only replace rows when certain conditions are met.
This is something we are looking into to ensure this can be done with the best performance. This will most likely require a few changes along with some additional settings that will allow a user to replace marker data according to the ID if the ID exists in the sheet.
With that said, we are presently working on the new major version of WP Google Maps, in which we hope to address as many of your points as possible.
For any further information and assistance, kindly get in touch with us directly via our support desk, as our agents will be better able to assist with more accurate and immediate information.
My sincerest apologies for the inconvenience.
Kindest Regards,
Steven De Beer

November 17, 2021 at 11:59 pm · #71271 · VR51 (Participant)

Thanks for your reply. I might submit a support request too. For now I will add some extra details here.
I notice that the import spends five minutes loading the ‘base constructor’. I have no idea where that construct might be so I will leave that to you guys to work on.
What seems to happen is that an import starts then 5 minutes are taken for the base constructor to load then the import restarts and another 5 minutes are spent loading the base constructor before the import enters the geolocation step.
Separately, I notice the maximum import script run time is hardcoded as 60*15, i.e. 15 minutes.
See the protected function attempt_set_time_limit found at line 201 in file wp-google-maps-pro/includes/import-export/class.import.php:
    protected function attempt_set_time_limit() {
        $desired_time_limit = 60 * 15;

        if(function_exists('set_time_limit'))
            set_time_limit($desired_time_limit);

        if(function_exists('get_time_limit'))
            ini_set('max_execution_time', $desired_time_limit);
    }
This hardcoded time limit means the server's PHP max_execution_time is ignored/overruled. The code should check whether the server's max_execution_time is higher than the desired time limit and use whichever is higher. (Incidentally, get_time_limit is not a PHP function, so that function_exists check always fails and the ini_set call never runs.)
See my edited version of this code:
    protected function attempt_set_time_limit() {
        if(function_exists('ini_get')) {
            $server_time_limit = (int) ini_get('max_execution_time');
        } else {
            $server_time_limit = 30;
        }

        $desired_time_limit = 60 * 15;

        if($server_time_limit > $desired_time_limit)
            $desired_time_limit = $server_time_limit;

        if(function_exists('set_time_limit'))
            set_time_limit($desired_time_limit);

        if(function_exists('ini_set')) // the original checked for get_time_limit, which does not exist
            ini_set('max_execution_time', $desired_time_limit);
    }
I might open a support ticket in a few days, depends how busy I am.
Lee / VR51
- This reply was modified 1 year, 4 months ago by VR51.
Attachments: (you must be logged in to view attached files)

February 7, 2022 at 7:44 pm · #72215 · VR51 (Participant)

These issues still need to be fixed, and the protected function in file wp-google-maps-pro/includes/import-export/class.import.php needs to be rewritten.