0

Hi folks,

I wonder if anyone out there could help me with something.
I have a folder containing many cnv files that look like this:

* Sea-Bird SBE 9 Data File:
* FileName = C:\CTD Data\Alg173\stn001.dat
* Software Version Seasave Win32 V 5.38
* Temperature SN = 4977
* Conductivity SN = 3436
* Number of Bytes Per Scan = 27
* Number of Voltage Words = 5
* Number of Scans Averaged by the Deck Unit = 1
* System UpLoad Time = Oct 04 2009 05:18:31
** Ship: Algoa
** Cruise: ACEP II
** Station: C09412
** Latitude: 26 23.607 S
** Longitude: 032 57.018 E
** Transect 1 - Mozambique
** Grazing
# nquan = 16
# nvalues = 31
# units = specified
# name 0 = prDM: Pressure, Digiquartz [db]
# name 1 = depSM: Depth [salt water, m]
# name 2 = t068C: Temperature [ITS-68, deg C]
# name 3 = t090C: Temperature [ITS-90, deg C]
# name 4 = potemp090C: Potential Temperature [ITS-90, deg C]
# name 5 = c0S/m: Conductivity [S/m]
# name 6 = sal00: Salinity [PSU]
# name 7 = sbeox0ML/L: Oxygen, SBE 43 [ml/l]
# name 8 = flECO-AFL: Fluorescence, Wetlab ECO-AFL/FL [mg/m^3]
# name 9 = par: PAR/Irradiance, Biospherical/Licor
# name 10 = spar: SPAR/Surface Irradiance
# name 11 = v1: Voltage 1
# name 12 = sal00: Salinity [PSU]
# name 13 = svCM: Sound Velocity [Chen-Millero, m/s]
# name 14 = sigma-é00: Density [sigma-theta, Kg/m^3]
# name 15 = flag: flag
# span 0 = 5.029, 35.213
# span 1 = 5.000, 35.000
# span 2 = 20.9034, 21.8131
# span 3 = 20.8983, 21.8079
# span 4 = 20.8916, 21.8059
# span 5 = 4.940027, 5.033821
# span 6 = 35.4444, 35.4573
# span 7 = 4.35836, 4.67184
# span 8 = 0.3789, 0.5774
# span 9 = 3.5226e+01, 2.4172e+02
# span 10 = 9.8951e+02, 1.5403e+03
# span 11 = 0.0626, 0.0702
# span 12 = 35.4443, 35.4533
# span 13 = 1525.01, 1527.07
# span 14 = 24.6142, 24.8617
# span 15 = 0.0000e+00, 0.0000e+00
# interval = meters: 1
# start_time = Oct 04 2009 05:18:31
# bad_flag = -9.990e-29
# sensor 0 = Frequency 0 temperature, 4977, 17/04/2008
# sensor 1 = Frequency 1 conductivity, 3436, 15/04/2008, cpcor = -9.5700e-08
# sensor 2 = Frequency 2 pressure, 89112, 21/5/2003
# sensor 3 = Extrnl Volt 0 Oxygen, SBE, primary, 0591, 19/11/03
# sensor 4 = Extrnl Volt 1 WET Labs, ECO_AFL
# sensor 5 = Extrnl Volt 2 userpoly 0, BBRTD-385R, 20/07/2007
# sensor 6 = Extrnl Volt 3 transmissometer, primary, CST-970DR, 06/12/2006
# sensor 7 = Extrnl Volt 4 backscatterance, 2355, 29/10/2003
# sensor 8 = Extrnl Volt 5 irradiance (PAR), primary, 70168, 17/03/2008
# sensor 9 = Extrnl Volt 6 altimeter
# sensor 10 = Extrnl Volt 9 surface irradiance (SPAR), degrees = 0.0
# datcnv_date = Oct 07 2009 05:23:59, 7.16
# datcnv_in = D:\Alg 173\CTD Data\Alg173_Raw CTD data\stn001.dat D:\Alg 173\CTD Data\Alg173_Raw CTD data\algoa_0746_ACEP_072009.con
# datcnv_skipover = 0
# wildedit_date = Oct 07 2009 05:26:26, 7.16
# wildedit_in = D:\Alg 173\CTD Data\Processed data\stn001.cnv
# wildedit_pass1_nstd = 2.0
# wildedit_pass2_nstd = 10.0
# wildedit_pass2_mindelta = 0.000e+000
# wildedit_npoint = 100
# wildedit_vars = prDM c0S/m sal00 sbeox0ML/L
# wildedit_excl_bad_scans = yes
# celltm_date = Oct 07 2009 05:29:16, 7.16
# celltm_in = D:\Alg 173\CTD Data\Processed data\stn001.cnv
# celltm_alpha = 0.0300, 0.0000
# celltm_tau = 7.0000, 0.0000
# celltm_temp_sensor_use_for_cond = primary,
# filter_date = Oct 07 2009 05:32:55, 7.16
# filter_in = D:\Alg 173\CTD Data\Processed data\stn001.cnv
# filter_low_pass_tc_A = 0.030
# filter_low_pass_tc_B = 0.150
# filter_low_pass_A_vars = prDM c0S/m
# filter_low_pass_B_vars =
# loopedit_date = Oct 07 2009 05:36:22, 7.16
# loopedit_in = D:\Alg 173\CTD Data\Processed data\stn001.cnv
# loopedit_minVelocity = 0.250
# loopedit_surfaceSoak: do not remove
# loopedit_excl_bad_scans = yes
# Derive_date = Oct 07 2009 05:39:02, 7.16
# Derive_in = D:\Alg 173\CTD Data\Processed data\stn001.cnv D:\Alg 173\CTD Data\Processed data\algoa_0746_ACEP_072009.con
# binavg_date = Oct 07 2009 05:42:22, 7.16
# binavg_in = D:\Alg 173\CTD Data\Processed data\stn001.cnv
# binavg_bintype = meters
# binavg_binsize = 1
# binavg_excl_bad_scans = yes
# binavg_skipover = 0
# binavg_surface_bin = no, min = 0.000, max = 0.000, value = 0.000
# file_type = ascii
*END*
5.029 5.000 21.8056 21.8003 21.7994 5.032665 35.4488 4.65753 0.4282 2.4172e+02 9.8951e+02 0.0645 35.4489 1526.92 24.6151 0.0000e+00
6.039 6.000 21.8092 21.8039 21.8027 5.033160 35.4494 4.65977 0.4554 1.9693e+02 1.0206e+03 0.0655 35.4495 1526.94 24.6147 0.0000e+00
7.045 7.000 21.8115 21.8063 21.8049 5.033505 35.4499 4.66138 0.4422 1.9115e+02 1.0276e+03 0.0650 35.4500 1526.97 24.6144 0.0000e+00
8.052 8.000 21.8119 21.8067 21.8051 5.033587 35.4498 4.66450 0.4545 1.7034e+02 1.0387e+03 0.0655 35.4499 1526.99 24.6143 0.0000e+00
9.059 9.000 21.8121 21.8069 21.8051 5.033646 35.4499 4.66610 0.4633 1.6056e+02 1.0427e+03 0.0658 35.4499 1527.00 24.6143 0.0000e+00
10.067 10.000 21.8131 21.8079 21.8059 5.033821 35.4500 4.67016 0.4665 1.5411e+02 1.0422e+03 0.0659 35.4501 1527.02 24.6142 0.0000e+00
11.072 11.000 21.8123 21.8070 21.8048 5.033758 35.4499 4.67162 0.4667 1.4018e+02 1.1178e+03 0.0660 35.4499 1527.04 24.6144 0.0000e+00
12.082 12.000 21.8098 21.8045 21.8022 5.033462 35.4493 4.67184 0.4830 1.2395e+02 1.0651e+03 0.0666 35.4493 1527.05 24.6147 0.0000e+00
13.085 13.000 21.8060 21.8007 21.7982 5.033032 35.4488 4.66982 0.4961 1.1477e+02 1.0785e+03 0.0671 35.4486 1527.05 24.6153 0.0000e+00
14.093 14.000 21.8051 21.7999 21.7972 5.032948 35.4484 4.65928 0.4565 1.1152e+02 1.1353e+03 0.0656 35.4483 1527.07 24.6153 0.0000e+00
15.097 15.000 21.7792 21.7740 21.7710 5.030081 35.4471 4.66203 0.4687 1.0431e+02 1.3038e+03 0.0661 35.4466 1527.01 24.6213 0.0000e+00
16.108 16.000 21.7231 21.7179 21.7147 5.024489 35.4497 4.64409 0.4337 9.7048e+01 1.2136e+03 0.0647 35.4481 1526.88 24.6381 0.0000e+00
17.113 17.000 21.6283 21.6232 21.6198 5.015128 35.4548 4.63877 0.4348 8.8181e+01 1.1866e+03 0.0647 35.4514 1526.65 24.6671 0.0000e+00
18.117 18.000 21.5675 21.5623 21.5588 5.009032 35.4568 4.61598 0.4593 8.4134e+01 1.1696e+03 0.0657 35.4528 1526.51 24.6850 0.0000e+00
19.127 19.000 21.5131 21.5080 21.5043 5.003506 35.4573 4.57525 0.5364 8.3381e+01 1.1797e+03 0.0687 35.4533 1526.39 24.7005 0.0000e+00
20.128 20.000 21.4251 21.4200 21.4161 4.994048 35.4552 4.56028 0.5774 7.5384e+01 1.3882e+03 0.0702 35.4502 1526.16 24.7225 0.0000e+00
21.137 21.000 21.2996 21.2945 21.2904 4.980852 35.4549 4.53999 0.5317 6.8363e+01 1.1980e+03 0.0685 35.4483 1525.85 24.7557 0.0000e+00
22.143 22.000 21.2137 21.2086 21.2043 4.972081 35.4550 4.48993 0.4839 6.3423e+01 1.2170e+03 0.0666 35.4488 1525.63 24.7798 0.0000e+00
23.151 23.000 21.1549 21.1499 21.1454 4.965830 35.4534 4.47629 0.4570 5.9354e+01 1.2144e+03 0.0656 35.4470 1525.49 24.7946 0.0000e+00
24.158 24.000 21.1131 21.1080 21.1034 4.961652 35.4541 4.46035 0.4449 5.7152e+01 1.2104e+03 0.0651 35.4479 1525.40 24.8067 0.0000e+00
25.165 25.000 21.0963 21.0913 21.0864 4.960029 35.4535 4.43695 0.4811 5.5788e+01 1.3497e+03 0.0665 35.4484 1525.37 24.8117 0.0000e+00
26.172 26.000 21.0921 21.0871 21.0820 4.959645 35.4526 4.42531 0.4807 5.4281e+01 1.2645e+03 0.0665 35.4485 1525.37 24.8130 0.0000e+00
27.175 27.000 21.0872 21.0821 21.0769 4.959222 35.4524 4.42024 0.4323 5.1083e+01 1.2715e+03 0.0646 35.4488 1525.38 24.8147 0.0000e+00
28.186 28.000 21.0783 21.0733 21.0679 4.958398 35.4519 4.41756 0.4275 4.9388e+01 1.2812e+03 0.0645 35.4492 1525.37 24.8174 0.0000e+00
29.192 29.000 21.0722 21.0672 21.0616 4.957797 35.4513 4.41682 0.4030 4.7150e+01 1.2673e+03 0.0635 35.4491 1525.37 24.8191 0.0000e+00
30.197 30.000 21.0655 21.0605 21.0547 4.957158 35.4511 4.41458 0.4170 4.4495e+01 1.2614e+03 0.0641 35.4491 1525.37 24.8210 0.0000e+00
31.204 31.000 21.0522 21.0472 21.0412 4.955819 35.4508 4.41263 0.4055 4.2201e+01 1.3798e+03 0.0636 35.4491 1525.35 24.8246 0.0000e+00
32.213 32.000 21.0255 21.0204 21.0143 4.952889 35.4494 4.41073 0.3928 3.9706e+01 1.2389e+03 0.0632 35.4474 1525.29 24.8307 0.0000e+00
33.214 33.000 20.9885 20.9835 20.9771 4.949170 35.4495 4.39706 0.3948 3.7106e+01 1.2453e+03 0.0632 35.4478 1525.21 24.8411 0.0000e+00
34.232 34.000 20.9489 20.9438 20.9373 4.944946 35.4483 4.38765 0.3789 3.5226e+01 1.3157e+03 0.0626 35.4464 1525.12 24.8509 0.0000e+00
35.213 35.000 20.9034 20.8983 20.8916 4.940027 35.4444 4.35836 0.4258 3.5964e+01 1.5403e+03 0.0644 35.4443 1525.01 24.8617 0.0000e+00

What I would like to do is remove the header information all the way down to the *END* line. I then need to add the ship, cruise, station, start time, latitude and longitude from the header to the beginning of each data line, so that it looks like:

Algoa ACEP II C09412 Oct 04 2009 05:18:31 26 23.607 S 032 57.018 E 35.213 35.000 20.9034 20.8983 20.8916 4.940027 35.4444 4.35836 0.4258 3.5964e+01 1.5403e+03 0.0644 35.4443 1525.01 24.8617 0.0000e+00

I then need to concatenate all these files into one massive file that can be imported into a program such as Excel.
Can anyone give me some assistance with this project? Greatly appreciated.

I am very new to Python and have very limited experience. The problem is that I don't know where to start in developing code to extract my information.


0

You would have to test each record for the specific info you want to keep ("Ship", "start_time", etc.) and store it, probably in a dictionary. If the example you posted has not been "improved" for our readability, then you can start writing once you hit *END*: headers first, and then each record in turn. You would probably want a comma-delimited file to import into Excel, provided there aren't any commas in the records themselves. Start with reading one file, one record at a time. Next, initialize a dictionary with the keys you want to search for. It appears that you can search the dictionary using the first word in each record (strip non-letters and split). Post back with some code for more assistance, and see section 10.8, Text Files, here: http://openbookproject.net/thinkcs/python/english2e/ch10.html
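
The steps described above can be sketched roughly as follows. This is only an illustration, not code posted in the thread: the names WANTED, parse_cnv and write_csv are made up for the example, it is written for Python 3, and it assumes every header line of interest looks like "** Key: value" or "# key = value", as in the sample file.

```python
import csv
import io

# Header keys to keep, in output order (taken from the sample file above).
WANTED = ("Ship", "Cruise", "Station", "Latitude", "Longitude", "start_time")


def parse_cnv(lines):
    """Return (header dict, list of data rows) for the lines of one file."""
    header = {}
    rows = []
    in_data = False
    for line in lines:
        if in_data:
            if line.strip():              # skip blank lines after *END*
                rows.append(line.split())
        elif line.startswith("*END*"):
            in_data = True                # everything after this is data
        else:
            # Drop the leading '*' / '#' markers, then try "key: value"
            # first and "key = value" second.
            text = line.lstrip("*# ").rstrip()
            for sep in (":", "="):
                key, found, value = text.partition(sep)
                if found and key.strip() in WANTED:
                    header[key.strip()] = value.strip()
                    break
    return header, rows


def write_csv(header, rows, out):
    """Write one comma-delimited line per data row, header fields first."""
    writer = csv.writer(out)
    prefix = [header.get(key, "") for key in WANTED]
    for row in rows:
        writer.writerow(prefix + row)


# Shortened sample: a few header lines and two truncated data records.
sample = [
    "** Ship: Algoa\n",
    "** Cruise: ACEP II\n",
    "** Station: C09412\n",
    "** Latitude: 26 23.607 S\n",
    "** Longitude: 032 57.018 E\n",
    "# nquan = 16\n",
    "# start_time = Oct 04 2009 05:18:31\n",
    "*END*\n",
    "5.029 5.000 21.8056\n",
    "6.039 6.000 21.8092\n",
]
header, rows = parse_cnv(sample)
out = io.StringIO()
write_csv(header, rows, out)
print(out.getvalue())
```

Because the csv module writes each header value as a single cell, values that contain spaces (such as "ACEP II" or the latitude) stay in one column when the file is opened in Excel.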


1

For starters, here is a filter that reads the number lines into a list; the newline characters are still in place.

This filter disregards the *END* tag and only keeps lines that hold exactly 16 numeric values (separated by exactly 15 single spaces) and nothing else:

def fifteennumbers(a):
    sep=[x for x in a if not x.isdigit() and x not in '.eE-+']
    return sep == [' ']*15+['\n'] # spaces and newline in the end

interesting = filter(fifteennumbers, open("stn001.txt").readlines())
print ''.join(interesting)

I saved the file from your message as an attachment (.cnv is not an allowed extension for uploads).
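
The filter above relies on every value being separated by exactly one space. A slightly more forgiving variant, sketched here as an assumption rather than as part of the original reply, splits on any whitespace and checks that each piece parses as a float. The name is_data_line is made up for the example, the default of 16 columns comes from the "# nquan = 16" header line, and the code is written for Python 3.

```python
def is_data_line(line, ncols=16):
    """True if the line holds exactly ncols whitespace-separated numbers."""
    parts = line.split()
    if len(parts) != ncols:
        return False
    try:
        for part in parts:
            float(part)        # raises ValueError for non-numeric text
    except ValueError:
        return False
    return True


data = ("5.029 5.000 21.8056 21.8003 21.7994 5.032665 35.4488 4.65753 "
        "0.4282 2.4172e+02 9.8951e+02 0.0645 35.4489 1526.92 24.6151 "
        "0.0000e+00\n")
candidates = ["# nquan = 16\n", "*END*\n", data]
kept = [line for line in candidates if is_data_line(line)]
print("".join(kept), end="")
```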

1

Here is another filter, which also picks up the info from the beginning of the file:

take = ["** Ship: ","** Cruise:","** Station:",
      "** Latitude:", "** Longitude:",
      "# start_time ="]

info = ""
lines = []

t=take.pop(0) ## take first marker from beginning of list take
for i in  open("stn001.txt").readlines():
    if lines or i.startswith('*END*'):
        lines.append(i)
    elif i.startswith(t):
            i = i[len(t):]
            info += i.rstrip()
            if take: t=take.pop(0)
lines=lines[1:] ## take out *END* line
for i in  [info+' '+j for j in lines]:
    print i,


0

Thanks to tonyjv, I can now transform my files to the desired format.

I added a file open and write function to the code. This works well, but how can I get it to open all the txt files in my folder, transform the data of each file and then save it with the "station" as the filename?

take = ["** Ship: ","** Cruise:","** Station:",
      "** Latitude:", "** Longitude:",
      "# start_time ="]
 
info = ""
lines = []

myfile = open('stn001_transformed.txt', 'w')

t=take.pop(0) ## take first marker from beginning of list take
for i in  open("stn001.txt").readlines():
    if lines or i.startswith('*END*'):
        lines.append(i)
    elif i.startswith(t):
            i = i[len(t):]
            info += i.rstrip()
            if take: t=take.pop(0)
lines=lines[1:] ## take out *END* line
for i in  [info+' '+j for j in lines]:
    myfile.write(i)
myfile.close()
0

How about like this? It puts the transformed files in subdirectory transformed (which you can set by changing the trans variable).

import os
trans = 'transformed'
info = ""
lines = []
list_of_tags = ["** Ship: ","** Cruise:","** Station:",
      "** Latitude:", "** Longitude:",
      "# start_time ="]

txt_list = [ x for x in os.listdir(os.curdir) if x.startswith('stn') and x.endswith('.txt') ]
if not os.path.isdir(trans):
    os.mkdir(trans)
for fn in txt_list:
    print fn,
    take = list_of_tags[:] ## copy of tag list because pop is destructive
    t = take.pop(0) ## take first marker from beginning of list take
    for i in  open(fn).readlines():
        if lines or i.startswith('*END*'):
            lines.append(i)
        elif i.startswith(t):
                i = i[len(t):]
                info += i.rstrip()
                if take: t=take.pop(0)
    lines = lines[1:] ## take out *END* line

    myfile = open(os.path.join(trans,fn), 'w')

    for i in  [info+' '+j for j in lines]:
        myfile.write(i)
    myfile.close()


0

Hi,

Thank you for the response. It works: it lists all the different files and saves the new text files in the new folder. However, there is a hiccup. It uses the data from the first file (stn001) for the other files. It doesn't use the data from stn002, etc.

0

lines=[] must obviously be moved to the right place: it needs to be emptied for every file. I had only one data file and made copies of it, so I did not notice the omission.

0

Hi. Sorry for the late response; I was away on a trip. I got it to work using different files, but I don't know how to empty or dump the previous header information.

1

You should really study a little more programming. You only need to move the initialisations inside the loop:

import os
trans = 'transformed'
list_of_tags = ["** Ship: ","** Cruise:","** Station:",
      "** Latitude:", "** Longitude:",
      "# start_time ="]

txt_list = [ x for x in os.listdir(os.curdir) if x.startswith('stn') and x.endswith('.txt') ]
if not os.path.isdir(trans):
    os.mkdir(trans)
for fn in txt_list:
    print fn,
    lines = []
    info = ""
    take = list_of_tags[:] ## copy of tag list
    t = take.pop(0) ## take first marker from beginning of list take
    for i in  open(fn).readlines():
        if lines or i.startswith('*END*'):
            lines.append(i)
        elif i.startswith(t):
                i = i[len(t):]
                info += i.rstrip()
                if take: t=take.pop(0)
    lines = lines[1:] ## take out *END* line

    myfile = open(os.path.join(trans,fn), 'w')

    for i in  [info+' '+j for j in lines]:
        myfile.write(i)
    myfile.close()
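
The original question also asked for one combined file, which the script above stops short of. A minimal sketch of that last step, assuming the per-station files already sit in a folder such as 'transformed' and that the combined file is written outside that folder so it is not read back in. The function name concatenate and the demo file contents are invented for the example, and the code is written for Python 3 rather than the Python 2 used in the thread.

```python
import os
import tempfile


def concatenate(folder, out_path):
    """Append every .txt file in folder to out_path, in name order.

    Returns the number of lines written. out_path should live outside
    folder, otherwise the output file would be picked up as an input.
    """
    count = 0
    with open(out_path, "w") as out:
        for name in sorted(os.listdir(folder)):
            if not name.endswith(".txt"):
                continue
            with open(os.path.join(folder, name)) as src:
                for line in src:
                    if line.strip():      # skip blank lines
                        out.write(line if line.endswith("\n") else line + "\n")
                        count += 1
    return count


# Small self-check with two made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "transformed")
    os.mkdir(src_dir)
    with open(os.path.join(src_dir, "stn002.txt"), "w") as f:
        f.write("B 1.0 2.0\n")
    with open(os.path.join(src_dir, "stn001.txt"), "w") as f:
        f.write("A 1.0 2.0\nA 3.0 4.0\n")
    out_file = os.path.join(tmp, "all_stations.txt")
    n_lines = concatenate(src_dir, out_file)
    with open(out_file) as f:
        combined = f.read()

print(n_lines)
print(combined, end="")
```

Note that the transformed lines are space-separated and some header values contain spaces themselves (for example "ACEP II"), so Excel will not necessarily line the columns up; writing comma-delimited output avoids that.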
0

Please close the thread, and thanks for your reputation comments. I need those, as my reputation is going downhill nowadays.
