|
Post by smmsamm on Nov 1, 2018 6:59:49 GMT -5
Would you Please guide me (maybe Simple and fast query if there is or some fast code) to convert my CSV data file (with commas separation):
1,A,C,Z,F,G 2,G,Q,R,C, 3,Z,G,Q, 4,C,F, 5,O,P, 6,O,X,Y,J, 7,A,P,X,
I have this table with ~10,000 records like these 7 records that you see (In real Database A,B,C,... are words in string), Records 1 and 2 are common in G and C value and 2,3 and 1,3 and ...
I want to sync records if they have at least two common value like Records 1 & 2,3,4 (but record 5,6,7 haven't at least 2 shared values with others) and generate a list like this:
1 A C Z F G Q R 2 G Q R C A Z F 3 Z G Q A C F R 4 C F A Z G Q R 5 O P 6 O X Y J 7 A P X
at the end we must have 4 same records if we sort data and one others without sync:
1 A C F G Q R Z 2 A C F G Q R Z 3 A C F G Q R Z 4 A C F G Q R Z 5 O P 6 J O X Y 7 A P X
Maybe I do not use good term for my meaning, please see:
1 A C Z F G 2 G Q R C
record 1 has C and G common with Record 2 now 1 has not R and Q thus we must have 1 A C Z F G + Q and R and Record 2 has not A,Z and F thus we must have: 2 G Q R C + A,Z and F thus at the end we have:
1 A C Z F G Q R 2 G Q R C A Z F
I need all records Respectively in the queue from top to bottom. wrote a delphi code but it is so slow. Someone suggest me this groovy code:
def f=[:] new File('Data.csv').readLines().each{ def items=it.split(',') def name items.eachWithIndex { String entry, int i -> if(i==0){ name=entry } else if(entry){ if(!f[entry]) f[entry]=[] f[entry]<<name } }
} f.findAll {it.value.size()>1}
It is very fast (because of using a map file I think), but It only finds the common values.
|
|
|
Post by tsh73 on Nov 1, 2018 8:41:33 GMT -5
Hello smmsamm
This is BASIC forum. More, this is Liberty BASIC forum, folks program "Liberty BASIC" here. If you explain your problem you could get answer in Liberty BASIC language. Now, will it help you in any way? What language code you posted written in?
EDIT ah I see. In "groovy code", "Groovy" is actual name of the language. One could only wonder ;')
As for now, your explanation says little to me I get how you process last two lines 1 A C Z F G 2 G Q R C to 1 A C Z F G Q R 2 G Q R C A Z F but how/why your 4,C,F, ended up as 4 C F A Z G Q R ? (where block C F A Z G Q R starts and where it ends, and if we have say first line duplicated as line #10, what will happen?)
|
|
|
Post by smmsamm on Nov 1, 2018 12:13:45 GMT -5
thank you for your time
1,A,C,Z,F,G 2,G,Q,R,C, 3,Z,G,Q, 4,C,F, 5,O,P, 6,O,X,Y,J, 7,A,P,X,
at first we have : 1 A C Z F G Q R 2 G Q R C A Z F 3 Z G Q A C F R 4,C,F, 5,O,P, 6,O,X,Y,J, 7,A,P,X,
Now at 4 we have C and F common with 1 thus we generate: 4 C F + a z g q r
|
|
|
Post by mknarr on Nov 1, 2018 15:00:16 GMT -5
I think I understand whjat you are doing and have the following code worked up.
array1$(1)="1,A,C,Z,F,G" array1$(2)="2,G,Q,R,C," array1$(3)="3,Z,G,Q," array1$(4)="4,C,F," array1$(5)="5,O,P," array1$(6)="6,O,X,Y,J," array1$(7)="7,A,P,X, " numberdata=7 for x=1 to numberdata array1$(x)=trim$(replstr$(array1$(x),","," ")) next for x=1 to numberdata: print array1$(x):next:print trace 2 for x=1 to numberdata a$=after$(array1$(x)," ") 'a$=1 A C Z F G print a$ 'code to do commonalty to go here.
next
But am I missing something. I undestand how 1 2 3 an4 end up but there appears to be som commonality between 5, 6 and 7 and 1 and 7. I have started the code to creat the common lines but had to stop when I ran into that issue.
|
|
|
Post by meerkat on Nov 5, 2018 2:39:31 GMT -5
Sorry I'm late on this, but I thought I'd give it a try in a database. I created a table comn with values x1 to x7. Inserted some values, and did a query. The query was very fast - at least for the limited entries.
CREATE TABLE comn ( n INT(3), x1 char(1), x2 char(1), x3 char(1), x4 char(1), x5 char(1), x6 char(1), x7 char(1)) ;
insert into comn values (1,'A','C','F','G','Q','R','Z'), (2,'G','Q','R','Z','','',''), (3,'Z','G','Q','','','',''), (4,'C','F','','','','',''), (5,'O','P','','','','',''), (6,'J','O','X','Y','','',''), (7,'A','P','X','','','','')
SELECT *, COALESCE (c1.x1 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7),0) + COALESCE (c1.x2 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7),0) + COALESCE (c1.x3 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7),0) + COALESCE (c1.x4 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7),0) + COALESCE (c1.x5 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7),0) + COALESCE (c1.x6 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7),0) + COALESCE (c1.x7 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7),0) as match FROM comn AS c JOIN comn AS c1 ON c1.n > c.n AND (c1.x1 <> '' AND c1.x1 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7) OR c1.x2 <> '' AND c1.x2 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7) OR c1.x3 <> '' AND c1.x3 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7) OR c1.x4 <> '' AND c1.x4 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7) OR c1.x5 <> '' AND c1.x5 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7) OR c1.x6 <> '' AND c1.x6 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7) OR c1.x7 <> '' AND c1.x7 in (c.x1,c.x2,c.x3,c.x4,c.x5,c.x6,c.x7) ) WHERE match > 1 ORDER BY c.n,c1.n
Hope this helps.
|
|