Quote: "Sorry to double post but the forum is takeing ages to accept my last one."
All new members are on post approval for a while so we can weed out spammers and undesirables!
I (or another mod) have to approve all posts that you make during this period. So some delay is inevitable, depending on what mods are online when you post.
Quote: "I have probably messed it up again"
Yes - a bit, but it's all part of the learning process. I'm happy to help out, but only if you are happy to drop the bad habits.
For example, these days my eyes are not as good as they used to be and I'm finding it more and more difficult to follow code which isn't indented correctly. To be honest, it's a lot to do with the font in the forum code boxes, but there's little we can do about that!
Note the 2-character indents of all loops and If..Then blocks in my code. Anyway, have a quick read of the tutorial here.
Quote: "I have started to look at writing somthing that remove all the rubbish"
By 'rubbish' I assume you mean groups of characters which are not proper English words, right?
Quote: "how to collect all the words between spaces as strings and compair them to a word list"
Create a string variable like NewWord$ and set it to NULL (an empty string) with:
NewWord$=""
Next, you'll need to open a text file and read each byte (number) in one at a time, converting it to a character with Chr$() and adding it to the end of NewWord$.
This is repeated until a space character is read in.
You then compare this string with the dictionary, discarding it if it isn't found there and adding it to a string array if it is.
You then set NewWord$ back to NULL and continue reading more bytes in until you reach the end of the file.
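To make the steps above a bit more concrete, here is a rough sketch of how the loop might look. It's only an outline: the file name, the array sizes, the three dictionary words and the CheckWord label are all made up for the example, and in a real program you would load the dictionary from your word list instead.

```
Rem Sketch only: the file name, array sizes and dictionary words are assumptions
Dim Dict$(3)
Dict$(1) = "the"
Dict$(2) = "cat"
Dict$(3) = "sat"
DictCount = 3

Dim Found$(1000)
FoundCount = 0
NewWord$ = ""

Open To Read 1, "File000001.txt"
While File End(1) = 0
  Read Byte 1, B
  If B = 32
    Rem A space marks the end of a word
    Gosub CheckWord
  Else
    NewWord$ = NewWord$ + Chr$(B)
  Endif
EndWhile
Close File 1

Rem Catch a last word that was not followed by a space
Gosub CheckWord

Print FoundCount; " words kept"
Wait Key
End

CheckWord:
  Rem Keep NewWord$ only if it appears in the dictionary
  IsWord = 0
  For N = 1 To DictCount
    If NewWord$ = Dict$(N) Then IsWord = 1
  Next N
  If IsWord = 1 And FoundCount < 1000
    Inc FoundCount
    Found$(FoundCount) = NewWord$
  Endif
  NewWord$ = ""
Return
```

The comparison here is case sensitive and only a space (byte 32) ends a word, so line breaks and punctuation would need the same treatment in a real program.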
The big problem is that you can't open a text file while it's in use. So, I suggest that you set a fixed number of bytes to write to the text file (say 64000), then close it, create a new one and continue.
You can then work on the completed files, which are no longer in use.
Using a variable counter, you can get DB to automatically name the text files File000001.txt, File000002.txt, File000003.txt and so on - each containing say 64K of text.
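As a rough sketch of the file naming and the 64000 byte rollover, something along these lines should do. The MakeFileName function, the OpenNextFile label and the dummy data (200000 copies of the letter A standing in for your real incoming bytes) are all assumptions for the example.

```
Rem Sketch only: MakeFileName, OpenNextFile and the dummy data are assumptions
FileNum = 0
Gosub OpenNextFile

Rem Stand-in for the real incoming data: 200000 copies of the letter A
For I = 1 To 200000
  Write Byte 2, 65
  Inc BytesWritten
  If BytesWritten >= 64000
    Rem This file is full, so close it and start the next one
    Close File 2
    Gosub OpenNextFile
  Endif
Next I
Close File 2
End

OpenNextFile:
  Inc FileNum
  BytesWritten = 0
  Name$ = MakeFileName(FileNum)
  Rem Remove any old file of the same name before writing a new one
  If File Exist(Name$) = 1 Then Delete File Name$
  Open To Write 2, Name$
Return

Function MakeFileName(Counter)
  Rem Pad the counter with leading zeros to six digits
  Num$ = Str$(Counter)
  While Len(Num$) < 6
    Num$ = "0" + Num$
  EndWhile
  Result$ = "File" + Num$ + ".txt"
EndFunction Result$
```

With the dummy data this should leave four files, File000001.txt to File000004.txt, with the last one only partly filled. Any file that has already been closed is safe to open for reading.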
There's a tutorial on strings on the page linked to above, and you'll also find one on arrays, explained in very simple terms.
Have a go with the outlines I've given you here and shout if you get stuck.
TDK_Man