Geek Culture / Take on C ; (And all the great links on Compiliers)

Author
Message
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 17th Jun 2005 23:20 Edited at: 17th Jun 2005 23:21
I've stopped working on my language for a while. Not much free time. The little free time I have is for a stupidly large project I managed to get obsessed with (no, it's not an RPG. It's not even a game).
I'll get back to language creation when I'm done with my current project (if ever).
*looks at watch*
ARG, I'm late for work!

David T
Retired Moderator
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: England
Posted: 17th Jun 2005 23:39
Quote: "The other dude's tutorial was Pascal. ARG. "


Is that a problem? You can just convert it...

I too have paused for a while on my uber-compiler written in C#. The "uber" doesn't describe functionality, but the sheer amount of bureaucracy involved in the whole thing. I suppose that's part and parcel of having every feature wrapped up in its own little class.

"A book. If u know something why cant u make a kool game or prog.
come on now. A book. I hate books. book is stupid. I know that I need codes but I dont know the codes"
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 7th Jul 2005 04:26 Edited at: 7th Jul 2005 04:26
You say you're making a C# compiler. What can it do?

PowerScript: Currently Working on new VB version
David T
Retired Moderator
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: England
Posted: 7th Jul 2005 05:07
Not much on the outside, but it has very nice support for all manner of arrays.

"A book. If u know something why cant u make a kool game or prog.
come on now. A book. I hate books. book is stupid. I know that I need codes but I dont know the codes"
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 1st Aug 2005 03:14
A little article on compiler construction and parsing.
http://www.cs.man.ac.uk/~pjj/farrell/compmain.html

Have recently been learning some 3D programming with OpenGL. Hope to have a nice little 3D wrapper complete soon.

Going to spend the rest of the night working on my interpreter. Progress has been slow, and I only have 1 month left of summer. My original deadline to have a working version of my interpreter was by the end of August, so I should still be able to hit that deadline.

How is progress coming along for everyone else?



A book? I hate book. Book is stupid.
(Formerly Yellow)
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 1st Aug 2005 05:13
I've stopped working on WarBasic for a long while now, but I'm doing something similar. I'm making a Dreamcast emulator, which is very close to the VM programming you'd do in a JITted language. It isn't exactly an interpreter, as it translates the code it runs and puts it in a code cache so it doesn't need to be translated again until it's modified. Any languages I make in the future will be based on this engine as it has MUCH better results than plain interpreting.
I've almost finished the CPU emulation (over 556,000 lines of code in 15mb of cpp/h files >_< ), and I've done bits of memory emulation also. After that I have to do the video card (which is gonna be tough as I don't have much experience with low-level 3D graphics manipulation), MAPLE BUS (also gonna be tough because of the documentation), and some other stuff.
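For anyone wondering what a code cache like that boils down to, here is a very rough sketch in Python. It is not taken from the emulator: the guest "instructions" are made-up (op, arg) tuples and every name (TranslationCache, translate, run, write) is hypothetical. It only shows the idea of translating once, reusing the result, and throwing it away when the guest code is modified.

```python
# Toy sketch of a translation ("code") cache. All names here are hypothetical.

class TranslationCache:
    def __init__(self, memory):
        self.memory = memory      # guest code memory: list of (op, arg) tuples
        self.cache = {}           # guest address -> translated host callable
        self.translations = 0     # how many times we actually had to translate

    def translate(self, addr):
        """Turn one guest instruction into a host-side callable."""
        op, arg = self.memory[addr]
        self.translations += 1
        if op == "add":
            return lambda acc: acc + arg
        if op == "mul":
            return lambda acc: acc * arg
        raise ValueError(f"unknown guest op {op!r}")

    def run(self, addr, acc):
        """Execute the instruction at addr, translating only on a cache miss."""
        if addr not in self.cache:
            self.cache[addr] = self.translate(addr)
        return self.cache[addr](acc)

    def write(self, addr, instr):
        """The guest modified its own code: drop the stale translation."""
        self.memory[addr] = instr
        self.cache.pop(addr, None)


memory = [("add", 5), ("mul", 3)]
vm = TranslationCache(memory)
acc = 0
for _ in range(3):                # run the same two instructions three times
    acc = vm.run(0, acc)
    acc = vm.run(1, acc)
```

The loop executes six guest instructions but only translates two of them; a plain interpreter would decode all six.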

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 4th Aug 2005 06:11
@MikeS

Quote: "How is progress coming along for everyone else?"

I'm probably on my third or fourth rewrite of the compiler I'm working on. I've ported what I have over to FreeBasic and I've managed to grab a little bit of spare time to work on it some more. But with my new job I really don't have a whole lot of time to work on anything compiler related.

There is talk of Basic4gl (the BASIC dialect I'm aiming my compiler for) being discontinued at the moment, so if it goes under I might just switch my compiler over to some other BASIC variant. Maybe even DBPro. I really don't want to commit a whole lot of time to writing a parser for a dialect of BASIC that will no longer exist or be used by anyone.

I'll finish up my new lexical scanner and take a look at the status of Basic4gl then. If it looks like Tom is going to call it quits, I'll probably move it on over to DBPro or make my own mini-dialect.

Either way, if I finish anything significant I plan on writing a series of tutorials on how to write a BASIC compiler for everyone. That should be fun.
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 4th Aug 2005 23:59
Wow TKF15H, sounds very impressive.



-------------------------
Good to hear you're still making progress. I heard about Basic4GL going under too. There were rumors about it going open source though, so then it might be worthwhile to just keep what you have. Either way, I'd love to see a series of tutorials for creating a BASIC compiler.



A book? I hate book. Book is stupid.
(Formerly Yellow)
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 5th Aug 2005 00:30
And the thread lives on

Yay

TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 6th Aug 2005 19:07
PowerSoft: What happened to PowerScript? You can't say "the thread lives on" and just walk out.

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 8th Aug 2005 12:31
:S


PowerScript is on hold (see sig)



I'm just amazed that the thread is still in existence and hasn't auto locked itself

David T
Retired Moderator
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: England
Posted: 8th Aug 2005 12:49
Autolocking was disabled a while ago I think.

"A book. If u know something why cant u make a kool game or prog.
come on now. A book. I hate books. book is stupid. I know that I need codes but I dont know the codes"
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 8th Aug 2005 12:50
oh. That explains it then

TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 17th Nov 2005 00:33
@Neophyte (if you're still around): I could really use that tutorial you were working on, even if it's incomplete. Could you please upload/e-mail it? I need it for my DC emulator. ^_^

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 17th Nov 2005 02:06
@TKF15H

I'm in the process of completing the tutorial right now actually. I've been quite busy with work so I really haven't had the time to progress as far as I'd like, but I can give you a basic outline.

My plan so far is, instead of working from a basic language and compiling into machine code, to work from an assembly language that compiles into machine code and work from there into a basic language. That should make the parsing side of things easier and allow me to dig into the intricacies of machine code generation.

So far I have the lexical scanner completed and I'm working on the simple parsing routines. I almost have the symbol table implementation completed. Just need to make the delete symbol and delete symbol table functions and I should be set.

Once I've completed that, I'll start with the tutorial series. Part 1 should cover symbol tables, linked lists, lexical scanning, simple parsing routines, and a skeletal code generator. When part 1 is completed, you will have a mini-assembler that will assemble a source file and output actual machine code that can then be linked into an executable.

I can't promise any deadlines right now, but I can promise you that I will finish this compiler. I also have to update my old tutorial concerning the MOV instruction, as I've learned quite a bit about the complex addressing modes of x86 instructions since I wrote it. A new version of the tutorial will probably appear somewhere in my compiler series of tutorials.
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 17th Nov 2005 02:19
Yay! Good to know you're still working on this.
Since my project isn't a programming language, the lexical scanning and parsing bits aren't necessary (for now... I'll probably have to give them a look later on). I'm generating machine code (from SH4 machine code) into RAM and running it from there directly.
Some of the code in my emulator is based on your MOV tutorial, so if anything is wrong and needs updating, please tell!

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 19th Nov 2005 20:45 Edited at: 19th Nov 2005 21:15
Neophyte: Know if the Offset part of the MOV instruction is signed? If it is, then why is ESP used as a local-variable pointer, despite its value changing all the time? Doesn't seem to make any sense.

If I can't use [EBP - 4], I'm thinking of storing local variables before doing the "Mov EBP, ESP" so I can use [EBP + 4] instead. Wonder if this has any implications, as nobody ever does it like this.
Normal function structure (Intel syntax):

My function structure (no idea if this works or is any better than the normal header):


WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 21st Nov 2005 03:29
Yes, I believe it is signed. I believe that EBP is used as the local variable pointer instead of the ESP register so you can arbitrarily push and pop values whenever you need to.

The interesting thing about the x86 architecture is that the stack grows downward. So the stack layout would be like this:



So your version should work. It is a little unorthodox, but I can't figure out any flaws in it right now. Of course, I haven't tested it, so due caution is appropriate when using that method.
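To make the "stack grows downward" point concrete, here is a small Python model of the standard prologue (push ebp / mov ebp, esp / sub esp, 4). This is only a sketch: the Stack class, the starting address 0x1000, and the values are made up for illustration. It shows why a negative (signed) offset from EBP reaches a local, and why the ESP-relative offset of that same local shifts after a push.

```python
# Toy model of an x86-style stack that grows downward. Addresses are made up.

class Stack:
    def __init__(self, top=0x1000):
        self.mem = {}        # address -> 32-bit value
        self.esp = top       # stack pointer
        self.ebp = 0         # frame pointer

    def push(self, value):
        self.esp -= 4        # growing the stack lowers ESP
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


s = Stack()
s.push(0xDEAD)               # stands in for the caller's return address

# Standard prologue: push ebp / mov ebp, esp / sub esp, 4
s.push(s.ebp)
s.ebp = s.esp
s.esp -= 4                   # reserve one 4-byte local at [ebp - 4]
s.mem[s.ebp - 4] = 42        # mov dword [ebp - 4], 42

local_via_esp_before = s.esp + 0   # right now the local is at [esp + 0]
s.push(7)                          # a temporary push moves ESP...
local_via_esp_after = s.esp + 4    # ...so the same local is now at [esp + 4]
```

The local never moves, and [ebp - 4] keeps pointing at it, but a compiler addressing it through ESP has to remember that the offset went from 0 to 4 after the push.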
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 21st Nov 2005 11:46
Quote: "It is a little unorthodox, but I can't figure out any flaws in it right now."

That's just the thing... if it's easier to just use EBP rather than ESP to keep track of locals, then WHY do compilers go through the trouble of tracking ESP when pushing/popping things?!? Everybody uses ESP, and I just don't see a reason to do so, which is why I'm thinking twice about using EBP instead.

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
empty
22
Years of Service
User Offline
Joined: 26th Aug 2002
Location: 3 boats down from the candy
Posted: 21st Nov 2005 13:06 Edited at: 21st Nov 2005 13:06
Quote: "if it's easier to just use EBP rather than ESP to keep track of locals, then WHY do compilers go through the trouble of tracking ESP"

Not all do. In fact Borland compilers use EBP to calculate the offset to local variables (unless some of the countless optimisation routines prevent them from using the stack to store local variables at all). It's the standard way of stack framing.


Play Nice! Play Basic! Version 1.089
Three Score
20
Years of Service
User Offline
Joined: 18th Jun 2004
Location: behind you
Posted: 22nd Nov 2005 03:51 Edited at: 22nd Nov 2005 04:12
have you figured out what the sib byte in a mov instruction is for?
(I just read through this whole thread today, for over an hour, so..)

and one more thing what does this mean in the mod r/m table




[edited]

ok, I just hit him with a shovel. Is he still conscious? Yea, I think so. Then hit him again!
If at first your dont succeed, then skydiving is not for you
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 22nd Nov 2005 13:28
That is how you access a register. [EAX] (Mod 00) gets the value pointed to by EAX, and EAX (mod 11) gets the value stored in EAX.
Or at least, that's what I understood...

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Three Score
20
Years of Service
User Offline
Joined: 18th Jun 2004
Location: behind you
Posted: 22nd Nov 2005 23:27 Edited at: 22nd Nov 2005 23:29
yes but why the / and all the different registers and mm0/xmm0

edit:
@neophyte
can i host your mini mov tutorial on my website(of course with credits to you) because well, there is just nowhere but this thread that explains it and the tut is in the middle of the thread so a bit hard to find

ok, I just hit him with a shovel. Is he still conscious? Yea, I think so. Then hit him again!
If at first your dont succeed, then skydiving is not for you
empty
22
Years of Service
User Offline
Joined: 26th Aug 2002
Location: 3 boats down from the candy
Posted: 23rd Nov 2005 00:08
Quote: "yes but why the / and all the different registers and mm0/xmm0"

That's just the list of registers a certain Mod R/M byte applies to.


Play Nice! Play Basic! Version 1.089
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 23rd Nov 2005 00:15
Quote: " yes but why the / and all the different registers and mm0/xmm0"

Because it depends on the instruction. The register code 000 doesn't name one register by itself: in an 8 bit instruction it means AL, in a 16 bit one it means AX, and in a 32 bit one it means EAX. If it's an MMX instruction it will use.... (drum roll)... MM0! (And XMM0 for SSE.)
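If it helps, here is how the fields of a ModR/M byte can be pulled apart in Python. The bit layout (2 bits mod, 3 bits reg, 3 bits r/m) and the register tables are the standard 32-bit x86 ones; the function names split_modrm and describe_rm are just mine, and describe_rm deliberately handles only the two simple cases discussed here (mod 11, and mod 00 without SIB or disp32).

```python
# Register names selected by a 3-bit ModR/M field, by operand size.
REGS = {
    8:  ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"],
    16: ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"],
    32: ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"],
}

def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm

def describe_rm(byte, size=32):
    """Describe the r/m operand for the simple cases only."""
    mod, _, rm = split_modrm(byte)
    if mod == 0b11:
        return REGS[size][rm]             # the register itself
    if mod == 0b00 and rm not in (0b100, 0b101):
        return f"[{REGS[32][rm]}]"        # memory pointed to by the register
    # rm=100 means a SIB byte follows, rm=101 means disp32; mod 01/10 add a displacement
    raise NotImplementedError("SIB and displacement forms are not handled in this sketch")
```

For example, the byte 0xC0 is mod 11 with r/m 000, so describe_rm(0xC0) gives EAX and describe_rm(0xC0, 8) gives AL, while 0x00 is mod 00 with r/m 000 and gives [EAX].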

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Phaelax
DBPro Master
22
Years of Service
User Offline
Joined: 16th Apr 2003
Location: Metropia
Posted: 23rd Nov 2005 08:12
Xcode, with Cocoa and objective C.


Deadly Night Assassins
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 23rd Nov 2005 13:38
Quote: " Xcode, with Cocoa and objective C."

Eh???

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Three Score
20
Years of Service
User Offline
Joined: 18th Jun 2004
Location: behind you
Posted: 23rd Nov 2005 19:31
did anyone else get that

btw
is neophyte on vacation or something

tutorials,programs,useful but simple php scripts, a place for code snipplets and more at
http://hackr83.0z0.co.uk
(still under construction)
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 25th Nov 2005 00:21 Edited at: 25th Nov 2005 00:31
Something I found googling: http://www.swansontec.com/sregisters.html
It's an article regarding "the art of register picking".
Good stuff. Reading this reminds me how much I hate the x86 architecture, and makes me wish I could get a G5.

http://www.unixwiz.net/techtips/win32-callconv-asm.html
Covers function calls (cdecl, stdcall). Basic stuff, probably covered previously in this thread, but good to have around.

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 25th Nov 2005 02:36
@TKF15H

I really haven't seen many compilers use ESP to fetch locals. The output of the few that I've looked at has always used EBP. What compiler did you use? The ones I've looked at, including MASM, all use EBP for locals.

Also, I believe that I read somewhere that using EBP is actually quicker than using ESP to access locals. This is due to some kind of one clock cycle delay when decoding instructions with ESP in the ModR/M byte. I'm not sure where I read it but I'm pretty sure it is true.

@Offset of reality

Not on vacation. Just rarely get around to posting in the forums these days.

Yes, I've figured out what the SIB byte is. It is used for accessing arrays. Assume for a moment that you have an array of bytes called "MyArray". Assume also that you want to cycle through each element in the array. First you would load up the address of MyArray into a register. This register will be our "Base". We would then clear some register and use that to hold the index of our array. We'll call this the "Index" (noticing a pattern?). Now in order to get the first byte of our array we would use a piece of code like this:
MOV AL, [EBX + ECX]

What this does is take the pointer from our base register and add it to our index register. Using the resulting pointer it then fetches a byte into the al register. It might just seem easier to use MOV al, [EBX] since anything plus zero is going to be itself, and you'd be right. However, what if you are in a for next loop and are cycling through the array with the loop index? That is where the real power of the sib shines. If you held the loop index in the ECX register, you could access each byte in the array sequentially, because you'll be incrementing the for next counter each loop.

This is a very useful optimization. However, there is a drawback: as written, it will only work on arrays of bytes. If you were to have an array of integers, the first iteration would work out really well. However, with an index counter set to one you'd wind up fetching from the second byte of the first integer, not the second integer! Allow me to explain with some simple math.

Suppose you have your array at address 10. Your EBX register (which contains the pointer to the array) would then hold the value 10. Since there are 4 bytes to an integer, the next integer would be located at address 14. With an index of 0 you'd get address 10, since 10 + 0 (EBX + ECX) equals 10. However, with an index of 1 you'd get 11. This is fine for bytes, but it won't do for anything larger like an integer, because the address needs to advance by 4 for each element. Bit of a severe restriction, eh? But there is a solution:

Scale to the rescue! Scale can be any one of 4 values: 1, 2, 4, or 8. And it works like this: the Scale value is multiplied by the index and the product is then added to the base. So with a scale value of 4 our instruction would look like this:
MOV EAX, [EBX + ECX * 4]
Using this feature we can now access arrays of integers in a for next loop with the loop counter. As I said before, this is limited to 4 values and one of them is redundant (Anything * 1 = Anything). But it is much more efficient and saves you from wasting a register to hold the pointer to the current element in the array.
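The arithmetic above is easy to check with a few lines of Python. This is just a sketch: split_sib follows the standard SIB layout (2 bits scale, 3 bits index, 3 bits base), and effective_address is a made-up helper that does base + index * scale + displacement with plain integers standing in for register contents.

```python
def split_sib(byte):
    """Split a SIB byte into (scale factor, index field, base field)."""
    scale = 1 << ((byte >> 6) & 0b11)   # the 2-bit field encodes 1, 2, 4 or 8
    index = (byte >> 3) & 0b111
    base = byte & 0b111
    return scale, index, base

def effective_address(base_value, index_value, scale=1, disp=0):
    """Address the CPU would compute for [base + index * scale + disp]."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return base_value + index_value * scale + disp


# The array from the example lives at address 10, so EBX = 10.
byte_addresses = [effective_address(10, i) for i in range(3)]
int_addresses = [effective_address(10, i, scale=4) for i in range(3)]
```

With scale 1 the addresses come out as 10, 11, 12 (fine for bytes), and with scale 4 they come out as 10, 14, 18 (one per 4-byte integer). The SIB byte for [EBX + ECX * 4] is 0x8B, which split_sib breaks down into scale 4, index 1 (ECX), base 3 (EBX).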
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 25th Nov 2005 11:33
Ah, always figured it'd be something like that, but wasn't sure.
Regarding the EBP/ESP thing, turns out my compiler (MSVC2005) uses EBP as a general purpose register, therefore has to use ESP to access locals.

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Halo Man
19
Years of Service
User Offline
Joined: 5th Nov 2005
Location:
Posted: 7th Dec 2005 14:21 Edited at: 11th Dec 2005 03:18
Good luck to everyone!


C++ Programming Tutorial - http://www.cplusplus.com/doc/tutorial/
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 20th Dec 2005 05:16
Found a few more links that may be of interest.

Interpreter example:
http://en.wikipedia.org/wiki/Interpreter_(computing)#Example_of_a_simple_interpreter

Self-interpretation:
http://arxiv.org/html/cs.PL/0311032

Mostly Compiler Construction things:
http://www.angelfire.com/ar/CompiladoresUCSE/COMPILERS.html

A book? I hate book. Book is stupid.
(Formerly Yellow)
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 20th Dec 2005 11:16
heh, self-interpretation... neat. I wonder what that guy was taking when he made it.

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Three Score
20
Years of Service
User Offline
Joined: 18th Jun 2004
Location: behind you
Posted: 21st Dec 2005 17:57
thanks, that really helps
btw
I'm building a virtual pc though instead of a compiler but if you can write machine code then you can read it also

tutorials,programs,useful but simple php scripts, a place for code snipplets and more at
http://hackr83.0z0.co.uk
(still under construction)
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 21st Dec 2005 18:13
mind saying a bit more of your project? I'm making an emulator so the projects are a bit related. It is still related to the original topic as emulators are very similar to compilers.

WarBasic Scripting engine for DarkBasicPro
DC emulator code size: 14.3MB, 553,214 lines
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 9th Apr 2006 10:47
*Bump*

I have some news about my compiler that I'll post later at length. In the meantime, how is everyone doing with their respective projects?
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 9th Apr 2006 18:00
Progress is a bit slow unfortunately on my compiler. However, I've begun working a bit with Lua, now that it's so available with DBP. I actually also finished a parser in DBP quite a while ago. I'm going to work on translating it over to FreeBasic (very similar to qBasic, but faster and still growing) so I can work with it more in that language.



---------------------------------

Kind of off topic, but here's a little tool I made based off your shader tutorial that I wanted to give you credit for.
http://forum.thegamecreators.com/?m=forum_view&t=76210&b=5

A book? I hate book. Book is stupid.
(Formerly Yellow)
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 9th Apr 2006 23:10 Edited at: 9th Apr 2006 23:14
Good to see you guys haven't given up.

I totally changed all my original plans:
I'm working on an Assembler that reads an XML file which defines the instructions and the output (so depending on the XML, it can generate Intel x86 code, or ARM code, etc.). Currently focused on ARM code, it can already output arithmetic instructions. Adding more instructions is (hopefully) just a matter of editing the configs.
When that's done, I'll have to make a linker. And after that, the actual compiler (aiming for a BASIC-like language).
This will take me a really long time though, I'm working on other things, and I have a job/classes to tend to. -_-
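To give an idea of the table-driven approach, here is a tiny Python sketch. Everything in it is an assumption for illustration: the XML layout, the attribute names, and the instruction set (a made-up 8-bit toy CPU where each instruction is one opcode byte followed by its operands as bytes). It is not the real config format and the encodings are not real ARM or x86.

```python
import xml.etree.ElementTree as ET

# Hypothetical instruction table for a made-up toy CPU (not real ARM or x86).
ISA_XML = """
<isa>
  <instr name="ADD" opcode="0x10" operands="2"/>
  <instr name="SUB" opcode="0x11" operands="2"/>
  <instr name="NOP" opcode="0x00" operands="0"/>
</isa>
"""

def load_isa(xml_text):
    """Build {mnemonic: (opcode, operand_count)} from the XML description."""
    table = {}
    for node in ET.fromstring(xml_text).iter("instr"):
        opcode = int(node.get("opcode"), 16)
        table[node.get("name")] = (opcode, int(node.get("operands")))
    return table

def assemble(source, isa):
    """Assemble lines like 'ADD 1, 2' into bytes using only the table."""
    out = bytearray()
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        mnemonic, _, rest = line.partition(" ")
        opcode, count = isa[mnemonic.upper()]
        operands = [int(tok) for tok in rest.split(",")] if rest.strip() else []
        if len(operands) != count:
            raise ValueError(f"{mnemonic} expects {count} operand(s), got {len(operands)}")
        out.append(opcode)
        out.extend(operands)      # each operand must fit in one byte here
    return bytes(out)


isa = load_isa(ISA_XML)
code = assemble("ADD 1, 2\nNOP", isa)
```

The assembler itself knows nothing about ADD or NOP; supporting a new instruction in this toy only means adding a line to the XML.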

DC emulator code size: 9MB. Compiled: 4MB.
Overall Status: 20% done. CPU: 80% (no floats), RAM: 10%, GFX: 0%
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 13th Apr 2006 02:27
@MikeS

Interesting tool. I'm glad my tutorials could be of assistance.

@Everyone

Right now, I've just recently completed a mini-assembler. It outputs valid COFF object files and calls Microsoft's linker to generate a valid executable. I'm thinking about writing a huge tutorial series that will cover how to make a compiler from the back end up. I could document the COFF object file format and how to output assembly to it.
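As a small taste of what documenting COFF would involve, here is a sketch in Python of packing just the 20-byte COFF file header that starts every object file. The field order (Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics) follows Microsoft's PE/COFF layout; the function name and the example values are made up, and a real object file still needs section headers, section data, a symbol table, and a string table after this.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C          # machine type for 32-bit x86
COFF_HEADER = struct.Struct("<HHIIIHH")   # little-endian, 20 bytes in total

def coff_file_header(num_sections, symtab_offset, num_symbols,
                     timestamp=0, characteristics=0):
    """Pack the COFF file header found at the start of an object file."""
    return COFF_HEADER.pack(
        IMAGE_FILE_MACHINE_I386,
        num_sections,
        timestamp,
        symtab_offset,        # file offset of the symbol table
        num_symbols,
        0,                    # SizeOfOptionalHeader: zero for object files
        characteristics,
    )


# Example values only: one section, symbol table at offset 0x100, 3 symbols.
header = coff_file_header(num_sections=1, symtab_offset=0x100, num_symbols=3)
```

Because everything is little-endian, the first two bytes of the result are 4C 01, which is a quick sanity check when looking at an i386 object file in a hex editor.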

Ultimately, I'd like to work from a working assembler, to working pseudo-assembler code, to a full-fledged BASIC compiler. I think I'll start documenting my progress so far and how I made out shortly. Would any of you guys be interested in anything like that?
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 13th Apr 2006 04:28
Sounds quite exciting Neophyte, and I really think that's the path to go in terms of compiler development. Documentation will also be very important, for you might have to restart (as I have numerous times), and that'll definitely help out a lot in your development. So of course, I'd be extremely interested (if you couldn't guess).



A book? I hate book. Book is stupid.
(Formerly Yellow)
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 13th Apr 2006 05:00
I would. There's more to making an executable than code generation, so being able to generate COFF files is handy. Any info on other people's experience is always helpful.

DC emulator code size: 9MB. Compiled: 4MB.
Overall Status: 20% done. CPU: 80% (no floats), RAM: 10%, GFX: 0%
The resurrected anarchist
19
Years of Service
User Offline
Joined: 5th Apr 2006
Location:
Posted: 13th Apr 2006 14:03
hmm, books on makin compilers, intersting, im a bit hung over, so ill check em out later

like wen i get bak from cafe! mmm greasy food
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 18th Jun 2006 02:42
lol

-----------------
Found a great link for those interested in Tokens and parsing.
http://users.skynet.be/wvdd2/Tokenizers/tokenizers.html

As for me, currently I'm writing a tokenizer in ANSI C. After that I'm going to take a look back at the executable generation links in this thread and have a go at that, or just link the code with a C compiler to be compiled(Hopefully the former though).

My number 1 goal for this summer is to finish this project, I can't believe it's almost been 2 years. How's everyone else's project going?



A book? I hate book. Book is stupid.
(Formerly Yellow)
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 20th Jun 2006 04:53
@MikeS

I really haven't worked a whole lot on the little assembler I have since last time. I've been either busy with work or have been working on my tutorial series for this project.

So far I have an introduction, tutorial 1, and tutorial 2 basically finished. I'm not sure I want to release them all just yet. I'd like to output a tutorial a week, but usually work or something else pops up and I really don't get around to working on the tutorials. So I'm trying to get a few done and then I'll release them one by one. Sort of give myself a buffer.

The whole process of revisiting the code I have and explaining why I structured it the way I did and what it does has led me to make some minor revisions that have improved the clarity and functionality of the code. So I guess this tutorial writing is rather beneficial.

Quote: "My number 1 goal for this summer is to finish this project, I can't believe it's almost been 2 years."

I too can't believe that it has been almost 2 years. Seems like just yesterday I was digging up articles on compiler construction and posting links to them en masse.
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 23rd Jun 2006 20:09
I'm still here people! lol

MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 11th Jul 2006 01:24
Some more links for those interested. These two are quite good ones worth looking at, especially if you're just getting into this kind of thing. Both are using BASIC code, so they could even be translated into DBP if necessary.

Full Basic Basic interpreter

Simple Compiler snippet by Mark Sibly



A book? I hate book. Book is stupid.
(Formerly Yellow)
Three Score
20
Years of Service
User Offline
Joined: 18th Jun 2004
Location: behind you
Posted: 11th Jul 2006 04:15
Neophyte, do you have the instructions part of your tutorial done?
If so, could you please email it to me: hackr9483-AT-gmail.com

I'm attempting to make an emulator (only attempting 8086 for now, but still). Or, well, I was, but reading this thread encouraged me to start working on it again.

JouleOS and friends
great thanks to http://galekus.com for FREE HOSTING!
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 9th Oct 2006 03:21
Update:

It appears that I'm finding it easier to advance the compiler than it is to advance my tutorial series. I've modified the compiler that I have to accept initialized data and I can get it to process a program that outputs a messagebox. Haven't progressed much on the tutorial front though.

I was intending to finish a few tutorials to give me a head start, but at the rate I'm going I might as well start releasing the ones I've finished now rather than later.
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 9th Oct 2006 03:21
Creating a Compiler - Introduction

As the title suggests, this post and the posts after it are going to be about creating a compiler. However, the series will have an unusual approach to the subject. Instead of starting with a BASIC language and working toward outputting machine code, we're going to start with a simple ASSEMBLY language which generates machine code and work our way toward a BASIC language. The reason for this is that outputting an object file with correct machine code is generally the hardest part of the process of making the compiler.

Many people can get the front end of a compiler working and make their own interpreter, but creating something that can output machine code and properly link it is a challenge that many fall short of completing. Consequently, it is my goal to start with the difficult part first and then work my way backwards toward the easy part.

Now I won't be claiming that this is the best way of going about things or that my way of programming the compiler is the most optimal way. In fact, I might wind up changing some things as I go along and at the end of this series might even re-write these tutorials for better clarity in both code and instruction. I'm not entirely sure myself of the precise steps that I'll need to complete my task. This is very much a work in progress. But it's taken me almost forever to get this far so I think waiting till I complete a compiler fully is out of the question. Better now than never.

Here is a brief overview of how our assembler-soon-to-be-compiler will be structured:

The compiler will be broken up into 4 parts:
The Lexer
The Parser
The IR Generator
The Code Generator

The source file for a program goes into the Lexer, which strips the source file of useless information and creates a linked list of all lexical tokens. That list is then sent to the Parser, which is divided into two parts: the Syntax Checker and the Semantic Checker. Loosely defined, the Syntax Checker makes sure that the program is structured right, and the Semantic Checker makes sure that the meaning of the program makes sense.

For example, the Syntax Checker would throw an error on an If block that has no matching endif, because the code is missing a key construct. The Semantic Checker would throw an error on an If block that does have its endif but contains the statement A = B$, because you can't assign a string to an integer. The meaning does not make sense because the types are incompatible. The Semantic Checker is the part of the compiler that catches incompatible types.
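The difference between the two checkers can be shown with a throwaway sketch in Python. This is not the parser from the tutorial series: the mini-dialect (If / Endif at the start of a line, names ending in $ being strings, anything else being an integer) and the function names are assumptions made only for this example.

```python
def check_syntax(lines):
    """Structure only: every If needs a matching Endif."""
    errors = []
    depth = 0
    for number, line in enumerate(lines, start=1):
        words = line.split()
        word = words[0].lower() if words else ""
        if word == "if":
            depth += 1
        elif word == "endif":
            if depth == 0:
                errors.append(f"line {number}: Endif without If")
            else:
                depth -= 1
    if depth:
        errors.append(f"{depth} If block(s) missing Endif")
    return errors

def is_string(operand):
    """In this toy dialect, names ending in $ and quoted literals are strings."""
    return operand.endswith("$") or operand.startswith('"')

def check_semantics(lines):
    """Meaning only: both sides of a plain assignment must have the same type."""
    errors = []
    for number, line in enumerate(lines, start=1):
        left, sep, right = line.partition("=")
        words = left.split()
        if not sep or not words or words[0].lower() == "if":
            continue                      # not a plain 'name = value' statement
        left, right = left.strip(), right.strip()
        if is_string(left) != is_string(right):
            errors.append(f"line {number}: type mismatch in '{left} = {right}'")
    return errors


missing_endif = ["If A = 1", "  A = 2"]
bad_types = ["If A = 1", "  A = B$", "Endif"]
```

The first snippet fails check_syntax but passes check_semantics; the second is structurally fine, so check_syntax is happy, but check_semantics flags line 2.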

Once the source code clears the parsing stage it is ready to be fed to the IR Generator. The IR Generator will transform the list of tokens into an Intermediate Representation. This is the format that our Backend will work with from here on out.

With our new Intermediate Representation of the program, our Code Generator can get to work assembling the program, outputting a COFF object, and sending it to a linker. This is also the phase where optimizations take place, but this won't really be touched upon until much later since we'll be working with assembly in the beginning.

This completes our brief introduction to our future compiler. Next post will contain a tutorial covering the code to one of our fundamental data structures: the Linked List.
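As a preview of how the Lexer and the Linked List fit together, here is a minimal sketch in Python. It is not the code from the upcoming tutorials: the Token class, the token kinds, and the choice of ; as the comment character are all assumptions for this example. It shows the two things described above, throwing away useless information (whitespace and comments) and chaining what is left into a singly linked list.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    kind: str                         # "ident", "number" or "punct"
    text: str
    next: Optional["Token"] = None    # link to the following token

TOKEN_RE = re.compile(r"""
    (?P<comment>;[^\n]*)        |
    (?P<number>\d+)             |
    (?P<ident>[A-Za-z_]\w*)     |
    (?P<punct>[,\[\]+*-])       |
    (?P<space>\s+)
""", re.VERBOSE)

def lex(source):
    """Return the head of a linked list of tokens (None for empty input)."""
    head = tail = None
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind in ("space", "comment"):
            continue                  # useless information: drop it
        token = Token(kind, match.group())
        if head is None:
            head = tail = token
        else:
            tail.next = token         # append to the end of the list
            tail = token
    return head

def to_list(head):
    """Walk the linked list and collect (kind, text) pairs for inspection."""
    out = []
    while head is not None:
        out.append((head.kind, head.text))
        head = head.next
    return out


tokens = to_list(lex("mov eax, 42 ; load the answer"))
```

The parser would then walk the list through the next links, one token at a time, without ever seeing the comment or the spaces.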
