Hello,

I have the following question. I have a text file with data like this:

1 observational study
1.1 cohort study
1.1.1 retrospective cohort study
1.1.2 prospective cohort study
1.2 cross-sectional study

And another file with data like this:

cross-sectional survey 12345.txt
retrospective study 2345.txt
...

I want to do approximate string matching. To be more specific, I want to read each line of the second file and find the line in the first file that is most similar to it. So for the first line of the second file, "cross-sectional survey" would be assigned to "1.2 cross-sectional study".

Is there any way of doing this? :-/

Split each record with split(). In the first file, don't use the first element of the list: join the rest (the [1:] slice) back into a string. Similarly, in the second file, don't use the last element (the [:-1] slice).
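A minimal sketch of that suggestion, assuming every line of the first file starts with an outline number and every line of the second file ends with a file name, each separated by whitespace. The function names strip_code and strip_filename are made up for illustration:

```python
def strip_code(line):
    """Drop the leading outline number from a line of the first file."""
    # split() with no arguments splits on any run of whitespace
    return " ".join(line.split()[1:])


def strip_filename(line):
    """Drop the trailing file name from a line of the second file."""
    return " ".join(line.split()[:-1])


# Sample lines taken from the question
first = strip_code("1.1.1 retrospective cohort study")
second = strip_filename("retrospective study 2345.txt")
print(first)
print(second)
```

This only cleans the two strings so that they can be compared; it does not measure how similar they are.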

Edited 5 Years Ago by woooee: n/a

I'm not sure how this is going to work with the split command. I want to compare the strings of the second file with those of the first, and from the comparison I want to get the line of the first file that looks most similar to each line of the second file.

For example, "retrospective study" is similar to "retrospective cohort study", so the comparison would return "retrospective cohort study" as the most similar string for the second-file line "retrospective study" :/

I will try your function and will let you know how it goes. Is there any module that can do this?

If you read the messages in the thread, you can see that there is a module, difflib, which provides related functionality.
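As a rough sketch of how difflib could be used here: difflib.get_close_matches from the standard library returns the candidates most similar to a given string, ranked by SequenceMatcher ratio. The helper name best_match, the cutoff value of 0.6 (difflib's default) and the in-memory sample lists are assumptions for illustration; in practice the lines would be read from the two files.

```python
import difflib

# Sample data from the question; normally read from the two files
taxonomy_lines = [
    "1 observational study",
    "1.1 cohort study",
    "1.1.1 retrospective cohort study",
    "1.1.2 prospective cohort study",
    "1.2 cross-sectional study",
]
query_lines = [
    "cross-sectional survey 12345.txt",
    "retrospective study 2345.txt",
]


def best_match(query_line, taxonomy, cutoff=0.6):
    """Return the taxonomy line most similar to query_line, or None."""
    # Map each label (outline number removed) back to its full line
    labels = {" ".join(t.split()[1:]): t for t in taxonomy}
    # Remove the trailing file name from the query line
    text = " ".join(query_line.split()[:-1])
    hits = difflib.get_close_matches(text, list(labels), n=1, cutoff=cutoff)
    return labels[hits[0]] if hits else None


for line in query_lines:
    print(line, "->", best_match(line, taxonomy_lines))
```

The similarity here is purely character-based, so a lower cutoff returns more (and looser) matches, while a higher one may return None for lines that have no close counterpart.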

Edited 5 Years Ago by pyTony: n/a
