python - How can I move data from one line to another using pandas
I'm reading in a file called peaks_ee.xpk that looks like this:

    label dataset sw sf
    1h 1h_2
    noesy_f1ef2e.nv
    4807.69238281 4803.07373047
    600.402832031 600.402832031
    1h.l 1h.p 1h.w 1h.b 1h.e 1h.j 1h.u 1h_2.l 1h_2.p 1h_2.w 1h_2.b 1h_2.e 1h_2.j 1h_2.u vol int stat comment flag0 flag8 flag9
    0 {1.h1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    1 {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.h1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    2 {1.h8} 8.13712 0.05000 0.10000 ++ {0.0} {} {1.h1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    3 {1.h1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {1.h8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    4 {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} {2.h1'} 5.90291 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    5 {2.h1'} 5.90291 0.05000 0.10000 ++ {0.0} {} {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    6 {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.h1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    7 {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.h8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    8 {1.h1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
    9 {1.h8} 8.13712 0.05000 0.10000 ++ {0.0} {} {2.h8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0

I want the values from the columns 1h.l, 1h.p, 1h_2.l, and 1h_2.p. This is my code:
    import pandas as pd

    result = {}
    df = pd.read_csv("peaks_ee.xpk", sep=" ", skiprows=5)
    shift1 = df["1h.p"]
    shift2 = df["1h_2.p"]
    mask = ((shift1 > 5.1) & (shift1 < 6)) & ((shift2 > 7) & (shift2 < 8.25))
    result = df[mask]
    result = result[["1h.l", "1h.p", "1h_2.l", "1h_2.p"]]
    for col in result.columns:
        if col == "1h.l" or col == "1h_2.l":
            result[col] = result[col].str.strip("{} ")
    result.drop_duplicates(keep='first', inplace=True)
    tclust_atom = open("tclust_ppm.txt", "w+")
    result.to_string(tclust_atom, header=False)

This is the output:
    0   1.h1'  5.82020  2.h8  7.61004
    3   1.h1'  5.82020  1.h8  8.13712
    5   2.h1'  5.90291  2.h8  7.61004
    11  4.h1'  5.74125  3.h6  7.53261
    12  3.h1'  5.54935  4.h8  7.49932
    15  3.h1'  5.54935  3.h6  7.53261
    18  2.h1'  5.90291  3.h6  7.53261
    21  4.h1'  5.74125  4.h8  7.49932
    27  6.h1'  5.54297  5.h6  7.72158
    32  4.h1'  5.74125  5.h6  7.72158

What I want the output to be is this:
    1.h1'  5.82020  0.3
    2.h8   7.61004  0.3
    1.h8   8.13712  0.3
    2.h1'  5.90291  0.3
    4.h1'  5.74125  0.3
    3.h6   7.53261  0.3
    3.h1'  5.54935  0.3
    4.h8   7.49932  0.3
    3.h1'  5.54935  0.3
    3.h6   7.53261  0.3
    6.h1'  5.54297  0.3
    5.h6   7.72158  0.3

I want to put everything into 2 columns, and I don't want duplicates of anything. How can I put the values of the third and fourth columns of my current output into the first and second columns, without including duplicates? And how can I add a constant value (0.3) in a third column?
Edit: updated code:
    import pandas as pd

    result = {}
    df = pd.read_csv("peaks_ee.xpk", sep=" ", skiprows=5)
    shift1 = df["1h.p"]
    shift2 = df["1h_2.p"]
    mask = ((shift1 > 5.1) & (shift1 < 6)) & ((shift2 > 7) & (shift2 < 8.25))
    result = df[mask]
    result = result[["1h.l", "1h.p", "1h_2.l", "1h_2.p"]]
    for col in result.columns:
        if col == "1h.l" or col == "1h_2.l":
            result[col] = result[col].str.strip("{} ")
    res = pd.lreshape(df, {'atom_name': ['1h.l', '1h_2.l'], 'ppm': ['1h.p', '1h_2.p']}).drop_duplicates()
    res['new'] = 0.3
    result.drop_duplicates(keep='first', inplace=True)
    tclust_atom = open("tclust_ppm.txt", "w+")
    result.to_string(tclust_atom, header=False)
    res.to_string(tclust_atom, header=False)

And the output:
    0   0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {1.h1'} 5.82020 0.3
    1   0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {2.h8} 7.61004 0.3
    2   0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {1.h8} 8.13712 0.3
    5   0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {2.h1'} 5.90291 0.3
    10  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {3.h6} 7.53261 0.3
    11  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {4.h1'} 5.74125 0.3
    12  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {3.h1'} 5.54935 0.3
    13  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {4.h8} 7.49932 0.3
    26  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {5.h6} 7.72158 0.3
    27  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {6.h1'} 5.54297 0.3
    29  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {5.h2'} 4.26210 0.3
    35  0.1 ++ {0.0} {} 0.05 0.1 ++ {0.0} {} 0.05 {} 0 0 0 100.0 0 0.0 {7.h8} 8.16859 0.3
IIUC you can use pd.lreshape:
    In [41]: df
    Out[41]:
           c1       c2    c3       c4
    0   1.h1'  5.82020  2.h8  7.61004
    3   1.h1'  5.82020  1.h8  8.13712
    5   2.h1'  5.90291  2.h8  7.61004
    11  4.h1'  5.74125  3.h6  7.53261
    12  3.h1'  5.54935  4.h8  7.49932
    15  3.h1'  5.54935  3.h6  7.53261
    18  2.h1'  5.90291  3.h6  7.53261
    21  4.h1'  5.74125  4.h8  7.49932
    27  6.h1'  5.54297  5.h6  7.72158
    32  4.h1'  5.74125  5.h6  7.72158

    In [43]: res = pd.lreshape(df, {'key':['c1','c3'], 'val':['c2','c4']}).drop_duplicates()

    In [44]: res
    Out[44]:
          key      val
    0   1.h1'  5.82020
    2   2.h1'  5.90291
    3   4.h1'  5.74125
    4   3.h1'  5.54935
    8   6.h1'  5.54297
    10   2.h8  7.61004
    11   1.h8  8.13712
    13   3.h6  7.53261
    14   4.h8  7.49932
    18   5.h6  7.72158

To add a third column with the constant value 0.3:
    In [45]: res['new'] = 0.3

    In [46]: res
    Out[46]:
          key      val  new
    0   1.h1'  5.82020  0.3
    2   2.h1'  5.90291  0.3
    3   4.h1'  5.74125  0.3
    4   3.h1'  5.54935  0.3
    8   6.h1'  5.54297  0.3
    10   2.h8  7.61004  0.3
    11   1.h8  8.13712  0.3
    13   3.h6  7.53261  0.3
    14   4.h8  7.49932  0.3
    18   5.h6  7.72158  0.3
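Regarding the updated code in the question: a likely reason it still prints every column is that pd.lreshape is called on df rather than on the four-column result. lreshape carries every column that is not named in the groups along as an identifier column, so the reshape probably needs to run on the subset. Below is a minimal sketch of that, assuming a small hand-made stand-in for the filtered peaks. The three sample rows and the index=False argument are illustrative choices, not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the masked rows of peaks_ee.xpk,
# already reduced to the four columns of interest.
result = pd.DataFrame({
    "1h.l":   ["{1.h1'}", "{1.h1'}", "{2.h1'}"],
    "1h.p":   [5.82020, 5.82020, 5.90291],
    "1h_2.l": ["{2.h8}", "{1.h8}", "{2.h8}"],
    "1h_2.p": [7.61004, 8.13712, 7.61004],
})

# Remove the surrounding braces from the label columns.
for col in ("1h.l", "1h_2.l"):
    result[col] = result[col].str.strip("{} ")

# Stack the second label/shift pair underneath the first one.
# Reshaping `result` (not the full `df`) keeps the other columns out.
res = pd.lreshape(
    result,
    {"atom_name": ["1h.l", "1h_2.l"], "ppm": ["1h.p", "1h_2.p"]},
).drop_duplicates()

# Constant third column.
res["new"] = 0.3

# index=False drops the leading row numbers from the written text.
text = res.to_string(index=False, header=False)
print(text)
```

Passing the open file handle as the first argument of to_string, as in the question, should write the same text to tclust_ppm.txt.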