Wikipedia talk:CSVLoader
Replacing links in a list of articles using Find & Replace
Hi,
Thanks for the great job. I used it last year for creating redirects and it was a great help.
What I need to do now is replace some bad links in some articles with the correct ones. Each line of my CSV file has page_title,old_link,new_link, and in the Find & Replace window I entered ##old_link## for Find and ##new_link## for Replace with. I get no error, but in the end no replacement occurs. Surprisingly, after starting and opening a page (which reports No changes), if I open the Find & Replace window I see the correct links loaded from the CSV file for the current page, meaning that everything works except the replacement itself. Could you please help me with this? Saeidpourbabak (talk) 21:46, 23 October 2018 (UTC)
- Hi @Saeidpourbabak:, the find and replace function seems to be broken in the current and new versions. I will troubleshoot the issue and get back to you. Thanks for the positive feedback. — Ganeshk (talk) 00:54, 24 October 2018 (UTC)
- @Saeidpourbabak: I have released a fix for the Find and Replace issue. Please download the new V24 DLL from here and let me know if it works for you. Also see instructions in the above section to unlock the DLL after download. — Ganeshk (talk) 02:26, 7 January 2019 (UTC)
- @Ganeshk: Hello again. I managed to narrow down the issue: it does not "see" parentheses when searching for text, e.g. when the CSV file has "foo1 (foo2)" in the Find field, it looks for "foo1 foo2" in the wiki page. When there are no parentheses in the Find field, it works properly. Saeidpourbabak (talk) 21:24, 27 January 2019 (UTC)
- Example: intending to replace دو (فیلم) with دو (فیلم ۱۳۹۳), this replacement (دو (فیلم ۱۳۹۳) او در → دو فیلم او در) has been made, showing that CSVLoader ignores parentheses in the "find" field when replacing. Saeidpourbabak (talk) 10:13, 9 March 2019 (UTC)
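The symptom described above (a find text of "foo1 (foo2)" hitting "foo1 foo2") is what a regular-expression engine does when the find text is used as a pattern, because unescaped parentheses form a group instead of matching literally. Whether CSVLoader actually does this internally is an assumption; the sketch below only illustrates the behaviour with Python's `re` module, using made-up page text.

```python
import re

# Hypothetical page text containing both variants.
page_text = "foo1 (foo2) and foo1 foo2"
find_text = "foo1 (foo2)"

# Used as a regex, "(foo2)" is a capture group, so the pattern
# matches "foo1 foo2" and skips the parenthesised text.
as_regex = re.sub(find_text, "X", page_text)

# Escaping the find text makes the parentheses literal again.
as_escaped = re.sub(re.escape(find_text), "X", page_text)

# A plain string replace never interprets the parentheses.
as_plain = page_text.replace(find_text, "X")

print(as_regex)
print(as_escaped)
print(as_plain)
```

If the tool does treat the find column as a pattern, writing the parentheses as `\(` and `\)` in the CSV file might serve as a workaround, but that is untested here.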
CSVLoader
Hi Balaji, thanks for your update on the CSVLoader page. I am always glad to hear that the plugin continues to be helpful. It has been 10 years since I developed the tool. Can you explain in more detail on how you used the tool on Tamil Wikisource? Thanks, — Ganeshk (talk) 22:16, 12 September 2018 (UTC)
- @Ganeshk: Thanks for responding here on the talk page. You have done awesome work in creating this CSVLoader add-on. @Info-farmer: taught me how to use the plugin. I use it on Tamil Wikisource to make changes in the header section and, for example, to create transclusion pages. A transclusion page shows, in the main namespace, content that was proofread in the Page namespace. Example: ta:s:கடவுள் வழிபாட்டு வரலாறு/ஊழ் வினை. First I prepare a list in CSV, then I use the plugin to create pages or to find and replace. I have some doubts about regex when using find and replace. Can you help me? -- Balaji (Let's talk) 00:25, 16 September 2018 (UTC)
- Great, thanks for the detailed note. What are your doubts on Regex? — Ganeshk (talk) 12:17, 16 September 2018 (UTC)
- @Ganeshk: On Wikisource, pages in the proofread namespace are structured as header, body and footer. The header of each book page is moved to the header section. After OCR is done on the image, the whole content of that page is placed in the body. I create a CSV list of header information. For example, for page number 478 of the following book, I search for "குரல்": the find pattern is "</noinclude>.*?குரல்" and the replacement is {{rh|'''476'''||'''அறத்தின் குரல்'''}}</noinclude>. The entry is: பக்கம்:மகாபாரதம்-அறத்தின் குரல்.pdf/478~/s</noinclude>.*?குரல்~{{rh|'''476'''||'''அறத்தின் குரல்'''}}</noinclude> . If the word "குரல்" is on the first line of the page, find and replace works properly. But if the word is on the second or a later line, nothing is replaced, even though I selected the multiline option in the find and replace settings. Is there a way to change my search regex so that it matches a term spanning multiple lines? If the description is not clear, please let me know. Link to the page mentioned in the example. -- Balaji (Let's talk) 05:30, 17 September 2018 (UTC)
- Have not forgotten this, I will check this out shortly. — Ganeshk (talk) 10:08, 21 September 2018 (UTC)
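A general regex point that may be relevant to the question above: in most engines, "multiline" only changes how `^` and `$` behave, while `.` still stops at a line break. Letting `.*?` cross lines needs a separate option (`re.DOTALL` in Python, `RegexOptions.Singleline` in .NET, or the inline flag `(?s)` in both). The sketch below uses Python's `re` module and an invented two-line page; it does not show what AWB or CSVLoader does with the pattern.

```python
import re

# Invented page text: the search word sits on the second line.
page = "</noinclude>first line\nsecond line குரல் rest"
pattern = r"</noinclude>.*?குரல்"
replacement = "{{rh|'''476'''||'''அறத்தின் குரல்'''}}</noinclude>"

# Default: "." does not match "\n", so nothing is replaced.
no_flag = re.sub(pattern, replacement, page)

# MULTILINE only affects ^ and $, so still nothing is replaced.
multiline = re.sub(pattern, replacement, page, flags=re.MULTILINE)

# DOTALL lets "." match "\n", so the match can span lines.
dotall = re.sub(pattern, replacement, page, flags=re.DOTALL)

# The inline flag (?s) has the same effect as DOTALL.
inline = re.sub(r"(?s)" + pattern, replacement, page)

print(no_flag == page, multiline == page)
print(dotall)
```

If the find and replace settings offer a single-line option, or the engine accepts `(?s)` at the start of the pattern, that would be the setting to try rather than multiline.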
CSVLoader appends instead of replacing
Hello. I use AWB v6.1.0.1. I want to do some find and replace using CSVLoader; specifically, I want to change the value of |link= in {{Taxonomy}} template subpages so that it uses the correct titles. To do this, I made a replace rule to replace link=(.*?) with link=##correcttitle##. I enabled CSVLoader, imported my CSV file, defined the header and started AWB. The expected result, for example, is to replace link=Aaadonta with link=آدونتا (the article's title in fawiki), but instead it returns link=آآدونتاAaadonta. I can't really figure out why this happens; can anyone help? Thanks. Ahmadtalk 12:42, 20 September 2019 (UTC)
- Hi @Ahmad252:, I will look into this. — Ganeshk (talk) 17:04, 20 September 2019 (UTC)
- Hi Ganeshk, any update on this? Thanks. Ahmadtalk 21:51, 18 October 2019 (UTC)
- Hi @Ahmad252:, your find condition should be link=##wrongtitle## and replace condition will be link=##correcttitle##. You must have the English title in the CSV file as one of the columns, ##wrongtitle##. CSVLoader does a simple find and replace. It does not support regex conditions. — Ganeshk (talk) 17:50, 1 March 2020 (UTC)
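One possible reading of the "append" symptom, offered as an assumption and not as a description of CSVLoader's internals: if `link=(.*?)` is evaluated as a regex, the lazy group at the end of the pattern matches the empty string, so only `link=` is replaced and the old title stays behind. The sketch below reproduces that shape with Python's `re` module and then shows the literal find/replace suggested in the reply above; the variable names are illustrative.

```python
import re

text = "|link=Aaadonta"
wrong_title = "Aaadonta"   # would come from a ##wrongtitle## column
correct_title = "آدونتا"    # would come from a ##correcttitle## column

# The lazy group (.*?) has nothing after it, so it matches zero
# characters: only "link=" is replaced and the old title remains.
appended = re.sub(r"link=(.*?)", "link=" + correct_title, text)

# Literal find and replace with both titles taken from the CSV row.
fixed = text.replace("link=" + wrong_title, "link=" + correct_title)

print(appended)
print(fixed)
```

With both titles present as CSV columns, no pattern matching is needed at all, which fits a tool that does simple find and replace.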
My replacement string is always just ##field##, not the relevant data from the CSV file
Hi. I'm using CSVLoader for the first time. I am using it on Wikisource, where I have hundreds of headers to replace, so it looks like the ideal tool for that. I have a CSV file - I chose format "CSV (MS-DOS)" in Excel - which is 2 columns wide. There are no column headings in the file itself - the first data row is row 1. The first column is a series of 368 article names. The second column contains the correct fully formed header for each specific article and I have a single, very simple search/replace, which is:
Search:
<noinclude><pagequality level="1" user="Mjbot" /><div class=pagetext>{{sidenotes begin}}{{USStatHeader | side = | volume= 33 | congress=Fifty-Eighth | congress word=FIFTY EIGHTH| session=2nd | chapter= | year=1904 | page= }}</noinclude>
Replace:
##field##
Unfortunately, in the article defined at A1, instead of replacing the search field with the contents of cell B1, it replaces it with the actual text ##field##. If I click skip, it then again attempts to replace the next matching header with the string ##field##. I have also tried the replacement field as [[##field##]], but that is then used as the actual replacement text instead. In the Column Headers box in CSVLoader I have ##article##,##field##. I do not have Regex checked in the search/replace box. When I do, the replace text is often multiple instances of ##field##.
My AWB version is 5.10.1.0, my CSVLoader version is 1.0.0.18 (I think...)
Is anybody able to help, please? Thank you. CharlesSpencer (talk) 12:13, 8 October 2019 (UTC)
- Update - same behaviour using 6.1.0.1 and 1.0.0.24. I am obviously doing something (probably simple) very wrong!! CharlesSpencer (talk) 12:49, 8 October 2019 (UTC)
- Hi @CharlesSpencer: I tried this on my Sandbox and was able to get it to work. Are you selecting the Find and Replace check box on the Plugin window (this is the key)? Easier if you had 3 columns, ##title##, ##find##, ##replace##. That way it is easy to enter them in the Find and Replace settings window on AWB. You can also store the settings for later use. — Ganeshk (talk) 19:09, 1 March 2020 (UTC)
- My column headers: ##title##~##find##~##replace##
- My CSV file is here:
User:Ganeshk/sandbox~<noinclude><pagequality level="3" user="CharlesSpencer" /><div class=pagetext>{{sidenotes begin}}{{USStatHeader | side = left | volume= 33 | congress=Fifty-Eighth | congress word=FIFTY EIGHTH| session=2nd | chapter=1120-1124 | year=1904 | page=1524}}</noinclude>~<noinclude><pagequality level="3" user="Ganeshk" /><div class=pagetext>{{sidenotes begin}}{{USStatHeader | side = left | volume= 33 | congress=Fifty-Eighth | congress word=FIFTY EIGHTH| session=2nd | chapter=1120-1124 | year=1904 | page=1524}}</noinclude>
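To make the three-column layout above concrete, here is a minimal sketch of how one row could be turned into a find and replace operation. It assumes `~` as the separator and `##name##` placeholders as in the column headers above; the function names and the short sample values are hypothetical, and this is not CSVLoader's actual code.

```python
def parse_row(headers_line, row_line, sep="~"):
    """Map each ##placeholder## header to the value in the same column."""
    headers = headers_line.split(sep)
    values = row_line.split(sep)
    if len(headers) != len(values):
        raise ValueError("row has %d columns, expected %d"
                         % (len(values), len(headers)))
    return dict(zip(headers, values))


def fill(template, row):
    """Substitute every ##placeholder## in the template with its value."""
    for name, value in row.items():
        template = template.replace(name, value)
    return template


def apply_row(page_text, find_template, replace_template, row):
    """Literal find and replace using the values of one CSV row."""
    return page_text.replace(fill(find_template, row),
                             fill(replace_template, row))


# Shortened, made-up row in the same shape as the example above.
row = parse_row("##title##~##find##~##replace##",
                'User:Example/sandbox~user="A" />~user="B" />')
result = apply_row('<pagequality level="3" user="A" />',
                   "##find##", "##replace##", row)
print(result)
```

In this sketch, a placeholder that is missing from the column headers is left in the text unchanged, which would look like the literal ##field## output reported at the top of this section. A separator other than a comma also avoids clashes with commas inside the header text.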
CSVLoader
Hi, I've opened a task at Phabricator. --Hispano76 (talk) 17:12, 29 February 2020 (UTC)
- Moved this discussion from my talk page to here. — Ganeshk (talk) 17:19, 1 March 2020 (UTC)
- Hi @Hispano76:, I have released a new version, 1.0.0.25 to fix the issue. Please try and let me know if that fixed it. — Ganeshk (talk) 19:41, 1 March 2020 (UTC)