Newcomers DBPro Corner / Beginners Guide To Programming - Part IV File Access

TDK
Retired Moderator
21
Years of Service
User Offline
Joined: 19th Nov 2002
Location: UK
Posted: 21st Dec 2006 03:25 Edited at: 21st Dec 2006 03:45
TDK_Man's Dark Basic Programming For Beginners


Part 4 - File Access


All but the most basic programs use file access. Although strictly speaking this also encompasses DB's commands for loading media files for your games such as images, sounds and models, this part of the tutorial series covers saving your program's data to disk and reading it back in again.

This process is required for reading INI files, saving hiscore tables or creating new file formats for your new world editor.

The basic process is to open a file for reading or writing, read in (or write out) the data then close the file. What you write is up to you, so long as you read the information back in the same order.

Now I know many people will argue with me, but I have decided that it's far simpler to write data as ASCII text files when you are learning to program. The main benefit is that you can open up your files after they have been created to see if they actually contain what you thought you had written. Some of DB's save data commands create encrypted files which can't be opened for examination - despite any advantages they may have.

The first thing you have to do is open a file. Assuming that we need to write a file before we are able to read it back in, this is done with OPEN TO WRITE.


OPEN TO WRITE

This command will create a new file on your hard disk and uses the syntax:

OPEN TO WRITE Channel,Filename$

Channel is an integer number and is like a 'stream' number. Filename$ is the filename you want to use.

The channel number is used because you can open more than one channel at a time. For example, you can open channel 1 to read and channel 2 to write simultaneously, allowing you to read from one file and write selected parts of it to a second file at the same time. By including the channel number in all of the commands, DB knows which file to access.

It's like connecting a pipe from DB to the file on disk. The channel number tells DB which pipe to send the data down when writing and which pipe to take the data from when reading. As long as you number the pipe(s), open the correct valves (READ or WRITE) before using the pipe and remember to close the valves when you are finished, you can use as many pipes as you need.

Filename$ can be a specific filename including the full path like:
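Purely as an illustration, a full path might look like the line below. The drive and folder names are hypothetical and not taken from the original post, and the folder would have to exist already:

```darkbasic
Rem Hypothetical full path - the folder C:\MyGame\Data is made up for this example
Filename$="C:\MyGame\Data\Mydata.dat"
```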



It can also be relative, so using just a filename like 'Mydata.dat', the file will be opened in the current project directory (where your DB program is located).

If you have a directory called 'DATA' in the current directory and wanted to save your data to a new file in there, you would set the filename to 'DATA\Mydata.dat'. The process will fail if the directory DATA does not exist though.

So, to open a file called 'Mydata.dat' in the current directory we would use:

Open To Write 1,"Mydata.dat"

This creates an empty file called "Mydata.dat" and connects our 'pipe' which is labelled '1'. But, it is very important that the named filename DOES NOT ALREADY EXIST. If it does, then you will get an error. To avoid this, you have the FILE EXIST() function.


FILE EXIST()

So, before creating a new file, you should always check to see if it exists already with:
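A minimal sketch of such a check, using the literal filename from the Open To Write line. The Rem line is only a placeholder for whatever action you decide to take:

```darkbasic
Rem Only run the enclosed code if the file is already on disk
If File Exist("Mydata.dat")=1
   Rem Do Something About It
Endif
```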



Here, the File Exist() function must be given the exact filename string as is used in the Open To Write command or you may not be checking for the existence of the file in the same location. It therefore makes sense to use a variable for the filename - rather than entering the filename literally:

Filename$="Mydata.dat"

If the file does exist, then the File Exist() function will return 1 (true) and if it doesn't exist will return 0 (false). So, in our example, the code between the If and Endif lines will only be carried out if the file does exist.

I put 'Do Something About It' in the above example because you have two options at this point. As you cannot open a file to save if it already exists, it HAS to be deleted so you can re-create a new one. But, what if the file is there and contains data which you don't want to lose?

Well we'll cover that later, but for now, we'll assume that it can just be deleted. So, we use Delete File:
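Written out in full, a sketch of the check-and-delete step could look like this (Filename$ is set here only so the fragment runs on its own):

```darkbasic
Filename$="Mydata.dat"
Rem Delete the old file so Open To Write can create a fresh one
If File Exist(Filename$)=1
   Delete File Filename$
Endif
```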



which can be shortened to:

If File Exist(Filename$) Then Delete File Filename$

Here, if you don't say =1, then it is 'implied' - in other words, DB assumes you are testing for true (=1). Also, as you only have a single action to carry out - not multiple lines of code, you can add the keyword THEN and include the action on the end of the IF line.

So, having checked for the existence of the file, deleted it if it was found and opened a new file for writing, we now have to write our data to disk.

Normally, this data would be variables. If you were writing say a matrix editor then all of the matrix data the user has created or altered would be in variables like MatrixWidth, MatrixHeight, TilesX and TilesZ etc. All we need to do is write all these relevant variables to disk.

Once the file has been opened, there are a number of commands to write different types of data. These include WRITE BYTE, WRITE FLOAT, WRITE FILE and WRITE LONG - each of which writes data in an 'encrypted' format (raw binary rather than readable text).

When I say encrypted I simply mean that you can't read the data with anything other than DB's respective READ command. Use WRITE FLOAT and you can only access the data with DB's READ FLOAT - you can't open it with say Windows Notepad and examine the contents. I am also reliably told that the formats DB uses cannot be loaded into other programming languages like VB either.

There is a way around this though, by using WRITE STRING for everything. As mentioned earlier, when you are learning DB, then I think it's important that you are able to write some data to a file then open it in Notepad and see if it contains what you actually thought you were writing.

The fact that all your output is strings is irrelevant - the same data is still stored and you are still learning how to save data to disk.

So, let's see some WRITE STRING examples:

Write String 1,"This is a sample text string!"

A$="This is a sample text string!"
Write String 1,A$


OK, these both do the same thing. The first example writes the literal string enclosed in the quotes to disk, (but not the actual quotes). You could use this method for the very first line of your file to write a header description of the file so if anyone opened the file to look at it, they would see what the file was for. For example, to identify them, MatEdit's MA0 matrix files all have the following first line:

MatEdit .MA0 File

Lines can be ignored by your loading routine, so you can create as big a header as you like.

The second example is what you use to write string variables. But, what if your variables are numeric - not string?

That's not a problem, we just convert them to strings when we write them out. For example:

MatrixWidth=20000
MatrixHeight=20000
TilesX=70
TilesZ=70
FloatVar#=44.82

Write String 1,"This is the header"
Write String 1,Str$(MatrixWidth)
Write String 1,Str$(MatrixHeight)
Write String 1,Str$(TilesX)
Write String 1,Str$(TilesZ)
Write String 1,Str$(FloatVar#)


As you can see, Str$() converts the numeric variables to strings before they are written. The original variables are not altered in any way by this process, and it works with float (real) numbers too. If you opened the resulting file with Notepad you would see:

This is the header
20000
20000
70
70
44.82

Having written our data out, we need to close the file. This is done very simply with:

Close File Channel

...where Channel is the channel number used when opening the file.

The complete routine for our example would therefore be:
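A sketch that pulls the steps above together: set the filename, delete any old copy, open the file, write the header and the variables as strings, then close. It assumes channel 1 is free and uses the values from the earlier example:

```darkbasic
Filename$="Mydata.dat"
MatrixWidth=20000
MatrixHeight=20000
TilesX=70
TilesZ=70
FloatVar#=44.82

Rem Open To Write fails if the file is already there, so remove it first
If File Exist(Filename$) Then Delete File Filename$

Open To Write 1,Filename$
Write String 1,"This is the header"
Write String 1,Str$(MatrixWidth)
Write String 1,Str$(MatrixHeight)
Write String 1,Str$(TilesX)
Write String 1,Str$(TilesZ)
Write String 1,Str$(FloatVar#)
Close File 1
```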



OK, that's written an example file, but what about reading the information back in?


OPEN TO READ

This process is very similar to writing files but using Read instead of Write. It's probably easier to show you the complete routine for reading the file generated by the above example code then discussing it afterwards:
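A sketch of a matching read routine, pieced together from the discussion that follows. It assumes the file was written in the order shown earlier (header, four integers, one float) and that channel 1 is free:

```darkbasic
Filename$="Mydata.dat"
If File Exist(Filename$)
   Open To Read 1,Filename$
   Rem The header must be read even though it is then ignored
   Read String 1,T$
   Rem Each numeric item arrives as a string and is converted with Val()
   Read String 1,T$: MatrixWidth=Val(T$)
   Read String 1,T$: MatrixHeight=Val(T$)
   Read String 1,T$: TilesX=Val(T$)
   Read String 1,T$: TilesZ=Val(T$)
   Rem The # on the receiving variable keeps the fractional part
   Read String 1,T$: FloatVar#=Val(T$)
   Close File 1
Endif
```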



OK, first of all, we check for the existence of the file we are trying to load. To avoid errors we only open the file if it's there. If it isn't then we don't attempt to open it. That's why all the reading code is enclosed inside the If File Exist(Filename$)...Endif block.

If the file does exist then we use OPEN TO READ along with READ STRING to get the data. As we know that all the data in the file is of type string, we can use the same string variable (T$) to read each data item in and then convert it where necessary.

There's no way to detect automatically what type of data is in a file, but as you are reading the same data that you wrote out, you already know what each string you read in has to be converted to - if it isn't actually a string. You just have to make sure that you load data strictly in the same order that you wrote it out or nothing will work!

The first of our data items is a text header. As this is unwanted information, we can ignore it once it is loaded, though it MUST be loaded as it's part of the file. Data files are sequential so in order to read say the third item in the file, the first two must be loaded first. So the rule is load EVERYTHING and ignore what you don't want!

The next item of our example is MatrixWidth which is numeric, so once the string version of the value has been loaded into the variable T$ we need to convert it to a numeric value with VAL().

After it is read in, T$ will equal "20000" so MatrixWidth=Val(T$) will convert T$ to the number 20000 and place that value into the numeric variable MatrixWidth.

The process is repeated re-using T$ for the remaining numeric variables in the file.

The last data item is a float. Val() doesn't mind, it will still convert the string "44.82" to the numeric value 44.82 as long as you use a float type variable to receive it. FloatVar#=Val(T$) will result in FloatVar# containing 44.82 which is what we want. However if you miss off the # symbol then FloatVar=Val(T$) will result in FloatVar equalling 44 because without the # it is an integer variable and you will lose the .82 off the end!

Finally the file is closed.


Saving Arrays:

There is a command in DB for saving arrays, but you cannot save more than one array in the same file as the command has to be supplied with the filename. Using the method we will discuss next allows you to save all the arrays from your program that you want - all in the same file. This is essential if you want to create your own file format.

Arrays are no more than simple variables in blocks. Each variable in the array can be accessed by using the array's index number and if you can access a variable, you can save it out to disk. Here's a useful example...


Hiscore Tables

Creating a hiscore table in your program is easy enough, but if it doesn't write the data to disk, the next time the program is run, all the hiscores are lost.

So, let's assume that our game has a hiscore table which holds the top 10 hiscores and the names of the players who scored them. For this we need two very simple arrays - Hiscore() and PlayerName$(). Hiscore() is an integer array as the hiscores will be numeric and PlayerName$() is naturally a string array.

These are created with:

Dim Hiscore(10)
Dim PlayerName$(10)


For these tutorials, once again I am purposely ignoring the fact that element 0 exists in an array as it makes life easier - we can refer to players/hiscores 1 to 10 rather than 0 to 9. The file on disk will be called HISCORE.DAT.

So, when your game runs it checks to see if the file HISCORE.DAT exists. If it's the very first time it has been run, then the file will not exist so it must be created and the arrays written out to disk. At this time they will obviously all be empty or contain 0 (zero).

At this point, the arrays written to disk are the same as in memory. The player plays the game and if their score gets on the hiscore table, the arrays are modified. Obviously the first time the game is played, ANY score will get onto the table so they enter their name and the data is stored in the two arrays.

When the game is exited, the existing file HISCORE.DAT is deleted (we already have a later version in memory) and the new contents of the two arrays written out to the file HISCORE.DAT.

The next time the game is run and it checks to see if the file HISCORE.DAT exists, it will be there, so instead of creating a new one, the old hiscore table is read in. Once in memory, our two arrays can be modified when a new hiscore is attained and on exit the hiscore table is just written out again - regardless of whether or not it has changed since last time.

Writing arrays is very simple. All we have to do is write the data in a loop which matches the size of the array. For Next loops are ideal for this. So, to write our array Hiscore() to disk with 10 elements, we would use:
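A sketch of that loop, wrapped in just enough code to run on its own. The filename Test.dat is a hypothetical stand-in and channel 1 is assumed to be free:

```darkbasic
Dim Hiscore(10)
Rem Test.dat is a made-up name used only for this fragment
Filename$="Test.dat"
If File Exist(Filename$) Then Delete File Filename$
Open To Write 1,Filename$
Rem One line of text per array element, elements 1 to 10
For N=1 To 10
   Write String 1,Str$(Hiscore(N))
Next N
Close File 1
```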



As you can see, Str$() is used as before to convert the numeric array data to string when writing it out to disk.

Reading the array back in is also just as simple:
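A sketch of the matching read loop, under the same assumptions (hypothetical filename Test.dat holding ten numeric lines, channel 1 free):

```darkbasic
Dim Hiscore(10)
Rem Test.dat is a made-up name used only for this fragment
Filename$="Test.dat"
If File Exist(Filename$)
   Open To Read 1,Filename$
   For N=1 To 10
      Rem Read the text version, then convert it back to a number
      Read String 1,T$
      Hiscore(N)=Val(T$)
   Next N
   Close File 1
Endif
```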



When writing string arrays, there is no need to convert the data, so we skip the Str$() section and just use:
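A sketch for the string array, again wrapped so it runs by itself. Names.dat is a hypothetical filename:

```darkbasic
Dim PlayerName$(10)
Rem Names.dat is a made-up name used only for this fragment
Filename$="Names.dat"
If File Exist(Filename$) Then Delete File Filename$
Open To Write 1,Filename$
Rem Strings are written as they are - no Str$() needed
For N=1 To 10
   Write String 1,PlayerName$(N)
Next N
Close File 1
```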



Reading the string array back in is done with:
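A sketch of the read side, assuming the hypothetical file Names.dat holds ten lines of text:

```darkbasic
Dim PlayerName$(10)
Rem Names.dat is a made-up name used only for this fragment
Filename$="Names.dat"
If File Exist(Filename$)
   Open To Read 1,Filename$
   For N=1 To 10
      Rem No Val() needed - the data is already a string
      Read String 1,T$
      PlayerName$(N)=T$
   Next N
   Close File 1
Endif
```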



Saving Multi-Dimensioned Arrays:

If the array you want to save is a multi-dimensioned array, then the process is identical - we just alter the loop accordingly. To save a numeric integer array which was created with DIM MultiArray(10,5) we would use:
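A sketch of the nested loop, with Nx as the inner loop and Ny as the outer one. The filename Multi.dat is hypothetical and channel 1 is assumed to be free:

```darkbasic
Dim MultiArray(10,5)
Rem Multi.dat is a made-up name used only for this fragment
Filename$="Multi.dat"
If File Exist(Filename$) Then Delete File Filename$
Open To Write 1,Filename$
Rem Outer loop steps through Ny, inner loop writes all ten Nx values for it
For Ny=1 To 5
   For Nx=1 To 10
      Write String 1,Str$(MultiArray(Nx,Ny))
   Next Nx
Next Ny
Close File 1
```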



Here, this nested loop will use Nx to write the 10 Nx array values for every Ny value in the Ny loop. So, the contents of MultiArray() will be written using Nx from 1 to 10 with Ny=1, followed by Nx from 1 to 10 with Ny=2 and so on until Ny=5.

Reading back in is the same as with single dimensioned arrays, but using exactly the same nested loop.
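A sketch of the read side, assuming a file written with Nx as the inner loop and Ny as the outer loop (Multi.dat is a hypothetical filename):

```darkbasic
Dim MultiArray(10,5)
Rem Multi.dat is a made-up name used only for this fragment
Filename$="Multi.dat"
If File Exist(Filename$)
   Open To Read 1,Filename$
   Rem Same loop order as the write routine, so every value lands where it came from
   For Ny=1 To 5
      For Nx=1 To 10
         Read String 1,T$
         MultiArray(Nx,Ny)=Val(T$)
      Next Nx
   Next Ny
   Close File 1
Endif
```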



OK, that's how data in arrays is saved to disk and read back in again. Once again, I will stress that it's very, very important that you read in the information in EXACTLY the same order that it was written out. Failure to do this can cause problems - especially when you realise that it is possible for the data you are reading in to be fed into the wrong variables. Your program will often not error during the load process in cases like this as the routine will load any data into any variables so long as the variable types match - they just won't work properly and the problem could be very difficult to trace.

So back to our hiscore example...

What we have to do now is place a small routine at the beginning which checks for the hiscore data file, creates it if it doesn't exist and reads it in if it does:
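A sketch of such a start-up routine. Storing each score followed by its player name is my assumption about the layout; any order works as long as reading and writing agree:

```darkbasic
Dim Hiscore(10)
Dim PlayerName$(10)
Filename$="HISCORE.DAT"

If File Exist(Filename$)
   Rem File found - load the saved table
   Open To Read 1,Filename$
   For N=1 To 10
      Read String 1,T$: Hiscore(N)=Val(T$)
      Read String 1,T$: PlayerName$(N)=T$
   Next N
   Close File 1
Else
   Rem First run - write out the empty table so the file exists
   Open To Write 1,Filename$
   For N=1 To 10
      Write String 1,Str$(Hiscore(N))
      Write String 1,PlayerName$(N)
   Next N
   Close File 1
Endif
```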



In your game, you write the code which checks the player's score at the end of each game and, if it's higher than the lowest score in the hiscore table, asks for the player's name and inserts the name and score into the two arrays - pushing the bottom entry off the list.

On exiting the program, we know that the file definitely exists so we just delete it and create a new file containing the contents of the hiscore arrays currently in memory - ready for being read in the next time the program is run.
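A sketch of the exit routine. The Dim and Filename$ lines are repeated only so the fragment runs on its own, and the score-then-name layout is an assumption that has to match whatever the loading routine expects:

```darkbasic
Dim Hiscore(10)
Dim PlayerName$(10)
Filename$="HISCORE.DAT"

Rem Remove the old copy - the arrays in memory are the latest version
If File Exist(Filename$) Then Delete File Filename$
Open To Write 1,Filename$
For N=1 To 10
   Write String 1,Str$(Hiscore(N))
   Write String 1,PlayerName$(N)
Next N
Close File 1
```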




File Formats

As you have seen, you can write many different types of variables while a file is open for writing, so when there is a lot of data to be written it's worth planning what order to write the data.

The structure of your data file is called a 'File Format' and all files created with Windows applications have one. There's a bitmap file format, a Microsoft Word file format and so on.

The file format defines for other users the layout of your file and what information can be found where, so they can add routines to their programs giving them the ability to load files created by your programs.

For example in a graphics file format one part of the file is the header, one is reserved for the colour palette and another part of the file will be the data which makes up the picture. You decide where the data goes in your own file format.

There are no fixed rules for designing a file format - just write the data out sensibly and logically. MatEdit for example creates a .MDF file with the Build option. If you were to look at an MDF file you would just see numbers - lots of them. Publishing the file format simply describes to others what these numbers are, what variable types they are and so on.

As a rule of thumb, you should have a description of the file type at the start saying what the file is used with. The numeric and string variables should come next and finally all the array data. Try not to have too much unwanted information like comments scattered about the file as it complicates the load routine - you still have to load all the useless information even though you are immediately going to discard it.


Loading Routines

If you write a program which creates a data file usable by other people you will also need to create a loading routine in DB which is supplied with your program. This will normally be a function (or collection of functions) which users can #Include in their programs so they can call the functions when required.

If you write a matrix editor or world editor then you want people to be able to use the creations made with your program in their own DB programs. If you don't provide them with a simple way to do this, then they are not going to want to use your program.


Reading Other Files

Open To Read isn't just restricted to reading files you created yourself with Open To Write. It can also be used to read information in from other files too. As long as you know the file format, you can read data in from graphics and text files.

Some of the easiest files to read in are plain ASCII text files created with a text editor, as each line is going to be a string.

However there is a limit of 255 characters with DB's strings so if the text file you are reading in has a line greater than 255 then the reading will end abruptly with an error. We'll ignore this point for the moment though and return to it later...

Also, another question is 'how much data do we read in'? As we didn't create the file, we have no idea how long the file is!


FILE END()

Luckily, DB gives us a function called FILE END() which uses the syntax:

File End(Channel)

...where Channel is the same as the channel used with Open To Read. This will return true (1) if the end of the file has been reached or false (0) if there is still more data to be read in.

Using this function in a loop, we can read all of the data in the file without having to know how much is there first. The data from a string-type file like this is usually read into a string array. You just need to dimension the array with a large enough number of elements before reading in the file or an error will occur while reading. Let's see an example:
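A sketch of such a loader, built from the description that follows. Placing Inc LineCount before the read is my assumption, so that lines land in elements 1 upwards and LineCount ends up equal to the number of lines read:

```darkbasic
Dim TextLines$(5000)
Filename$="DOCUMENT.TXT"

If File Exist(Filename$)
   Open To Read 1,Filename$
   Rem Reset the counter in case the routine is used more than once
   LineCount=0
   Repeat
      Inc LineCount
      Read String 1,T$: TextLines$(LineCount)=T$
   Until File End(1)=1
   Close File 1
Endif
```

Note that this sketch does nothing to stop LineCount going past 5000, so a longer file would still cause the array error mentioned above.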



This example creates a string array with 5000 elements and is thus able to read up to 5000 lines from a text file. The filename is set to DOCUMENT.TXT and we use our usual method of placing the loading code inside an If...Endif which checks to see if the named file exists.

The important part of this example is that we are not using a For...Next loop any longer as we don't know how many lines there are in the file - and we therefore don't have any start and end values for this kind of loop. Instead we use a Repeat...Until loop which uses File End() to check if the end of the text file has been reached.

The Read String line reads each piece of data into T$ and it is then placed into the string array using the numeric counting variable LineCount. If anyone is wondering why I use:

Read String 1,T$: TextLines$(LineCount)=T$

rather than

Read String 1,TextLines$(LineCount)

it's because I have encountered problems in the past when reading array values directly. Since using a normal string variable to read the data and then transferring the contents to an array I haven't encountered those errors. Feel free to use whichever method you like - the end result should be the same...

As the variable we use for counting in the loop would normally be the For...Next counting variable - which obviously is not available here - we have to increment LineCount manually each time around the loop. This is done with Inc LineCount and the line LineCount=0 is used before entering the loop - to ensure that the counting loop starts at 0 (in case the routine is used more than once).

This loop continues reading lines of text from the text file until there are no more lines to read and then drops out of the loop. At this point, LineCount is equal to the number of lines read in from the text file. Knowing this, we can add a For...Next loop to the end of the program which will print the lines read in to the screen:
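A sketch of that print loop. The two sample lines are stand-in data so the fragment runs on its own; in the real program the array and LineCount would already have been filled by the loading code:

```darkbasic
Dim TextLines$(5000)
Rem Stand-in data - normally the loader fills these in
TextLines$(1)="First line"
TextLines$(2)="Second line"
LineCount=2

Rem Print every line that was read, starting at element 1
For N=1 To LineCount
   Print TextLines$(N)
Next N
Wait Key
```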



And that's all there is to reading a text file.


Line Too Long?

Going back to earlier in this tutorial, I briefly mentioned that DB will error if you try to read in a string which is longer than 255 characters. So, what do you do if this happens?

Well basically, you switch to reading the line in a character at a time rather than a line at a time. This is quite a bit slower than reading in a line, but as it's the only way around the problem it's better than nothing.

For MatEdit Pro's in-built help files, I needed a text file of the MatEdit documentation, but with each line short enough to fit on the screen. The problem was that the existing docs contained quite large paragraphs and when exported as a text file, each paragraph became one single line - most of which were a lot larger than 255 characters in length!

Below is the small program I wrote to solve the problem. What it does is read data in from the file a byte (character) at a time in a loop until it is a given length, (or it reads in the two bytes 13 and 10 - the carriage return and line feed pair which marks the end of a line in Windows text files), at which point a new line is started.

The two variables EndLineTrigger and ContainsWords are worth mentioning. When ContainsWords is set to 1 then the file being read in is deemed to be a document containing words and when set to 0, just data.

EndLineTrigger is the length of the required lines after reading in the data and what it does depends on what ContainsWords is set to. If ContainsWords is set to 0 and EndLineTrigger is set to 80 then each line is cut off at the 80th character.

If ContainsWords is set to 1 and EndLineTrigger is set to 80 then the line is cut off at the end of whatever word is at position 80.

When all the lines have been read in and shortened, they are written out to another file. Here's the program:
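What follows is a simplified sketch of the idea rather than the original listing. The two filenames are hypothetical, and unlike the program described above it writes each shortened line out as soon as it is complete instead of storing them all first. It also forces a break at 255 characters so a long run without spaces cannot exceed the string limit:

```darkbasic
Rem Hypothetical filenames - change these to suit
InFile$="LongLines.txt"
OutFile$="ShortLines.txt"
EndLineTrigger=80
ContainsWords=1

If File Exist(InFile$)
   If File Exist(OutFile$) Then Delete File OutFile$
   Open To Read 1,InFile$
   Open To Write 2,OutFile$
   T$=""
   While File End(1)=0
      Read Byte 1,B
      If B=13
         Rem Carriage return - the original line ends here
         Write String 2,T$
         T$=""
      Else
         Rem Byte 10 (line feed) is skipped, everything else is added to the line
         If B<>10
            T$=T$+Chr$(B)
            If Len(T$)>=EndLineTrigger
               Rem Data: cut at once. Words: wait for a space (32) so no word is split
               If ContainsWords=0 Or B=32 Or Len(T$)>=255
                  Write String 2,T$
                  T$=""
               Endif
            Endif
         Endif
      Endif
   EndWhile
   Rem Write whatever is left if the file did not end with a line break
   If Len(T$)>0 Then Write String 2,T$
   Close File 1
   Close File 2
Endif
```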



Use this program on any text file which you can't read with the Read String method. This will convert the file and give you a new file which can be loaded with the Read String method. The two filenames at the beginning of the program allow you to set the input and output filenames.

OK, that's it for the File Access tutorial. If there's some aspect of File Access you think I've missed and would like to see covered then let me know.

TDK_Man

Gil Galvanti
19
Years of Service
User Offline
Joined: 22nd Dec 2004
Location: Texas, United States
Posted: 21st Dec 2006 05:46 Edited at: 21st Dec 2006 05:47
Although I think it's great you're writing all these tutorials, don't you think it'd be better if you kept them in one thread for easy access to all topics?
EDIT: Doh, just saw that there's a link sticky at the top, never mind.

Pirates of Port Royale
Live the life of a pirate.
TDK
Retired Moderator
21
Years of Service
User Offline
Joined: 19th Nov 2002
Location: UK
Posted: 21st Dec 2006 05:53 Edited at: 21st Dec 2006 05:54
Yup there is!

I've been asked to put all my tutorials in there, but unfortunately, I can't add them into the sticky post unless they've been added to the board in the usual way, so there's no way around it I'm afraid.

Hopefully they'll soon disappear down the list...

Don't worry - you aren't the only one who doesn't read stickies...

TDK_Man

indi
21
Years of Service
User Offline
Joined: 26th Aug 2002
Location: Earth, Brisbane, Australia
Posted: 21st Dec 2006 06:06
I wish people would read them, however the awesome effort made by TDK allows us to point them to a collective link.
Rich wants us to try and keep sticky amounts to a minimum
