Sorry your browser is not supported!

You are using an outdated browser that does not support modern web technologies, in order to use this site please update to a new browser.

Browsers supported include Chrome, FireFox, Safari, Opera, Internet Explorer 10+ or Microsoft Edge.

Geek Culture / Take on C ; (And all the great links on Compiliers)

Author
Message
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 29th Sep 2004 10:41
@MikeS

"Keep up the awesome work, and I really mean that."

Thanks for the encouragement.

I just started putting all of the code I have written so far for this tutorial into one big program. So far I have over 700 lines of code down in total.

I'll probably take some time out from working on the tutorial to post those links I found. Maybe tomorrow if I get the time.
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 29th Sep 2004 10:46
Hey, MikeS, I'm writing my interpretor in C++ too. How's it going? I've been working on mine for 4 or 5 months now, so it's almost done. I just need to add arrays (this is a pain) and more graphics commands.

PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 14th Oct 2004 05:26
sorry wrong thread

Create? Play? YOU Decide
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 17th Oct 2004 11:31
http://www.devincook.com/goldparser/index.htm

Another wonderful link I just uncovered today.



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 23rd Oct 2004 01:47
Certainly is... will download when i return home


Create? Play? YOU Decide
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 7th Nov 2004 21:50
Thanks Mike, that is very useful


Create? Play? YOU Decide
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 8th Nov 2004 01:14
No problem.



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 10th Nov 2004 07:24
Well it appears I owe an update on my tutorial.

Last I reported, the tutorial was at about 26 kb in size. Currently it is at about 88 kb in size. The hard-coding of the assembly program took a lot longer than I anticipated. It also took a lot more code than I had anticipated. There is about 1,500+ lines of code for the hard-code assembly program which I completed encoding a little while ago.

I've begun work on the assembling functions now and have completed the addressing mode byte assembler function and the MOV instruction assembler function. The MOV instruction assembler was probably the hardest one I'm going to face and the work on the addressing mode byte assembler has made my life considerably easier. Currently, I'm working on the LEA instruction assembler and should finish that by tonight.

After that, I have CALL, PUSH/POP, XOR, OR, CMP, and JUMP instructions to do. The CALL instruction should be the hardest of that lot with the JUMP instruction coming in second. The rest will probably be push overs.

It looks like I won't be able to finish this tutorial till christmas the way it is going currently, but I still have hope that I'll complete it before then.

After I finish this tutorial I'm probably going to go back to work on my compiler. I've decided that I won't be including the machine code output part of the initial version of the compiler. Originally, I had planned to use the functions I'm writing now to achieve just that. Unfortunately, I've recently realized that the encoding I am using is seriously flawed. Since all of these functions I'm writing are based around this encoding I'll probably have to rewrite them in varying degrees eventually.

In the mean time, I plan on constructing the compiler so that it outputs its results to a text file in MASM or FASM format(I haven't decided which). I'll be able to verify my early results and improve on them without the resorting to a disassembler. Later on I can add the machine-code generator if I feel like it.

Also, I still have those papers I've found on-line kicking around on my harddrive. Sooner or later I plan on posting links to them with a brief little description when I get the chance. Hopefully.
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 11th Nov 2004 04:54
Sounds good Neophyte. Glad to see you're making great progress.

Quote: "Currently it is at about 88 kb in size."

Many many lines.

----------------------------------------------
Just opened my compiler book back up to read some more for the first time in a couple months. Probebly end up trying to write another interpreter in C++ by January or so.



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 11th Nov 2004 15:24
which compiler book is that Mike?





(seems we have done well to get this post alive and kicking)


Create? Play? YOU Decide
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 12th Nov 2004 04:51
Dragon book, aka Compilers: Principles, Techniques, and Tools.

It's not visually pleasing to read, but it has some good information in it and structure to it. Definitly worth the buy if you can get it at a bargain price.



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 13th Nov 2004 01:14
price?


Create? Play? YOU Decide
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 1st Dec 2004 17:37
@MikeS:
How much is it?


Create? Play? YOU Decide
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 2nd Dec 2004 07:12
It cost me $50, but you'll have to find a copy yourself, thus varying prices.

Might be a good idea to google around and get a good deal.

Amazon:
http://www.amazon.com/exec/obidos/tg/detail/-/0201100886/qid=1101942707/sr=8-1/ref=pd_csp_1/102-6561822-2026537?v=glance&s=books&n=507846



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 3rd Dec 2004 03:07
thanks


Create? Play? YOU Decide
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 28th Dec 2004 16:05
How is it going?

Liberty: Fight for Freedom
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 28th Dec 2004 20:30
yeah, how is it going?
would any of you guys be interested in making some kind of dev-diary where we could post about language creation? That way we could probably help each other out by seeing what other people did.


Preston C
22
Years of Service
User Offline
Joined: 16th May 2003
Location: Penn State University Park
Posted: 28th Dec 2004 23:46 Edited at: 29th Dec 2004 00:09
That's a good idea. We could all just make a seperate forum for just that, posting about how our languages are going and what's up, we could help each other out (like you said), etc. It's a great idea.

Now we just need a forum...I could organize a part of my forum for such things, it's not like it's being used anyway

[Edit] Might be best if I got input on it first though, I mean, it'd really be a waste of time (however small) should I setup the forum and no one uses it.

Cheers,
Preston

Intel Celeron 1.3 Ghrz | 512 MB Ram | NVIDIA GeForce FX 5200 128MB
MSVC++ .Net 2003 | Wings3D | CharacterFX | Gimp v2.0
Prayne de crabug ahm rinedere be-yogt iglo kes gron
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 29th Dec 2004 00:56
Sounds good! Though, I'd like to hear from some of the other dudes too...


MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 29th Dec 2004 02:04
Yeah, sounds like a plan. Might actually get me to develop a little more if I were a bit more organized. Would also be a good idea if we took all the links from this thread and made an organized page. I could work on that I suppose, and I know I posted a post with all the links somewhere(Well, most of them).



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
Preston C
22
Years of Service
User Offline
Joined: 16th May 2003
Location: Penn State University Park
Posted: 29th Dec 2004 03:54
Well, if we were going to use my forum for it, it might be best to change plans. I was trying to upgrade my YABBSE installations to the new Simple Machines Forum, and well, let's just say it didn't go so well, and I don't have the time to reupload the backup. I'm also leaving for a friends house in about ten minutes, and I won't be back till late tomorow.

So, good luck with finding a new forum if you're impatient, and if you can't or decide to wait, I should have it fixed when I get back.

Cheers,
Preston

Intel Celeron 1.3 Ghrz | 512 MB Ram | NVIDIA GeForce FX 5200 128MB
MSVC++ .Net 2003 | Wings3D | CharacterFX | Gimp v2.0
Prayne de crabug ahm rinedere be-yogt iglo kes gron
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 29th Dec 2004 04:34
By me there's no problem, I can wait till tomorrow. If someone else wants to set up the board, then that's ok too.


Preston C
22
Years of Service
User Offline
Joined: 16th May 2003
Location: Penn State University Park
Posted: 30th Dec 2004 06:43 Edited at: 30th Dec 2004 07:09
All righty then, forum is fixed (re-uploaded the backup, glad I got that before my little experience with Simple Machines) and I have a category and three boards dedicated to it (Showcase, Help, and Dev Diary) and if there should be more, just suggest em, I'll be glad to add em in.

I'll be looking through this thread and Programming Discussion for other compiler related stuff and post it in one of the boards.

Anyway (Sorry for it being a bit slow):
http://www.ppcdesigns.com/yabbse/

Cheers,
Preston

[Edit] Damn Mike (and Neophyte), how many links did you post?!?!

Intel Celeron 1.3 Ghrz | 512 MB Ram | NVIDIA GeForce FX 5200 128MB
MSVC++ .Net 2003 | Wings3D | CharacterFX | Gimp v2.0
Prayne de crabug ahm rinedere be-yogt iglo kes gron
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 30th Dec 2004 07:51 Edited at: 30th Dec 2004 07:51
Haha, that's only like half of them.

You can be happy to know you've probebly got the biggest list of url's for compilers on the entire internet though.

Just registered now though. Thanks for setting up and hosting the boards Preston.



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
Preston C
22
Years of Service
User Offline
Joined: 16th May 2003
Location: Penn State University Park
Posted: 30th Dec 2004 08:20 Edited at: 30th Dec 2004 08:38
Quote: " Haha, that's only like half of them."


Dammit, that was what was on the first page, a bit less than half. Starting second page now though, now that I have more coffee.

[Edit] Ok, I have most of the links on Page 2 (that were not repeated) on the resource page, and I'm leaving the rest for later. Now to start on me own little language.

Quote: "You can be happy to know you've probebly got the biggest list of url's for compilers on the entire internet though."


Bah, google has more

Quote: "Just registered now though. Thanks for setting up and hosting the boards Preston."


No prob, just don't turn into a c-m-c 101 and we'll be happy.

Cheers,
Preston

Intel Celeron 1.3 Ghrz | 512 MB Ram | NVIDIA GeForce FX 5200 128MB
MSVC++ .Net 2003 | Wings3D | CharacterFX | Gimp v2.0
Prayne de crabug ahm rinedere be-yogt iglo kes gron
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 30th Dec 2004 09:19
Looks great! I'm registering right now.
<ad>
Concerning the language you're making, may I interest you with the Aphotic VM? Writing bytecode that it can run is easier than processor-specific OpCode. And it has plugin-support that is very similar to DBP's (but without the annoying stringtable).
</ad>




PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 23rd Jan 2005 19:10
that sounds interesting TK

Liberty: Fight for Freedom
David T
Retired Moderator
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: England
Posted: 23rd Jan 2005 19:20
Gravedigger!

Get 15 new commands, all the date / time commands left out of DBPro for free!
DOWNLOAD PLUGINS HERE: http://www.davidtattersall.me.uk/ and select "DarkBasic"
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 23rd Jan 2005 19:33
well im wondering when the next installment of the tutorial will surface

Liberty: Fight for Freedom
David T
Retired Moderator
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: England
Posted: 23rd Jan 2005 20:23
Mine?

Get 15 new commands, all the date / time commands left out of DBPro for free!
DOWNLOAD PLUGINS HERE: http://www.davidtattersall.me.uk/ and select "DarkBasic"
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 26th Jan 2005 03:43 Edited at: 26th Jan 2005 03:43
@David T

Mine?

No, probably mine.

@POWERSOFT

Bad news, Powersoft. I think I'll be cancelling the tutorial since I'm running into bugs that I just can't deal with. Combine that with the fact that the tutorial is kind of taking away time from me writing on my compiler I think I'll probably just have to end it. I'm not completely sure I'll do that yet though.

If I do I'll probably just post the second half, and if requested, the third as yet unfinished part.

Of course, I have another batch of links to interesting papers I've found coming soon so don't be too disappointed.
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 26th Jan 2005 08:43
NOOOOOoooooOOOOO!!!!!!!!
I was just waiting for the part of the tutorial where you would explain JMP/CALL opcodes! ME NEEDS THEM!!

AphoticVM status: 80% AphoticBasic status: 10%
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 26th Jan 2005 11:35
What part of the JMP/CALL opcodes do you need explained? Do you need how they are represented in binary format? I could easily explain that for you if you need it.
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 26th Jan 2005 22:16
yay! yeah, that's what I need. This will allow me to do a good amount of optimization to my VM.

AphoticVM status: 80% AphoticBasic status: 10%
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 27th Jan 2005 05:58
@TKF15H

I'll start working on that in a little while then. But first, the links.

@MikeS

I promised this a while ago but I've been a little busy (and a little lazy) lately. So better late than never I suppose.

A Formal, Language-Independent, and Compositional Approach to Interprocedural Control Dependence Analysis
http://citeseer.ist.psu.edu/600335.html

Discusses a formal approach to creating interprocedural dependency graphs. A bit on the advanced side, but interesting nonetheless.

An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms
http://citeseer.ist.psu.edu/kowarschik03overview.html

Caches are becoming more and more important as CPU speeds race on while memory speeds lag behind. This paper describes a number of advanced optimization techniques to improve cache performance.

Building a Control-flow Graph from Scheduled Assembly Code
http://citeseer.ist.psu.edu/cooper02building.html

This is more of an interesting topic than a usefull topic. It deals with, as the name implys, taking compiled assembly and uncompiling it into a control-graph.

Designing Intermediate Representations for Optimization
http://citeseer.ist.psu.edu/207376.html

This paper is paticularly instructive. It discusses the various design issues one might encounter when designing an Intermediate Representation. Recommended reading.

Instruction Scheduling Using Integer Programming
http://citeseer.ist.psu.edu/230478.html

This paper discusses optimal instruction scheduling using interger programming techniques. The technique is a bit time-consuming but it appears to produce good results.

Integer Programming Based Register Allocation
http://citeseer.ist.psu.edu/220772.html

I was considering integer programming as an alternative to graph-coloring(since graph-coloring works poorly on the IA-32). Later on, I found out that it was not easy to implement and slow to compile though it did produce good results. I've since changed plans and no longer want to rely on Integer Programming for register allocation, but you might decide differently so here is the paper.

Operator Strength Reduction
http://citeseer.ist.psu.edu/54279.html

Operator strength reduction is a basic optimization that takes costly(strong) operations and replaces them with cheap(weak) operations. This paper describes a simple algorithm to do just that and that was built for beginners in mind.

Practical Improvements to the Construction and Destruction of SSA From
http://citeseer.ist.psu.edu/193349.html

Describes some improvements to Static Single Assignment form. Usefull if you are going to use SSA form.

Value Driven Redundancy Elimination
http://citeseer.ist.psu.edu/66713.html

Advanced Optimization. Is a combination of Value Numbering(see below) and code motion.

Combining Analyses, Combining Optimizations
http://citeseer.ist.psu.edu/213053.html

Nifty super-optimization that combines Constant Propogation, Unreachable Code Elimination, Global Congruence Finding and Global Value Numbering. This is also worth quoting:

Quote: "In conjunction with global code motion, these peephole optimizations generate excellent code very quickly, a useful feature for compilers that stress compilation speed over code quality."


Might be worth looking into if, like me, you are worried about compilation speed.

Value Numbering
http://citeseer.ist.psu.edu/86621.html

Value Numbering assigns an identifying number to each expression. The numbers are assigned in such a way so that if two numbers match then it can be proven that the two expressions are equal for all program inputs.
This paper discusses some improvements to the heuristic.

Advanced Copy Propagation for Arrays
http://citeseer.ist.psu.edu/vanbroekhoven03advanced.html

Advanced Optimization. After an array is as signed, we can, under certain conditions, replace a read from this array by the righthand side of the assignment. If so, the intermediate assignment can be skipped. In case it becomes dead code, it can be eliminated.

Minimum Register Instruction Sequencing to Reduce Register Spills in Out-of-Order Issue Superscalar Architectures
[href]hpc.serc.iisc.ernet.in/papers/IEEETC03-GovEtAl.pdf[/href]

Method of order instructions so the minimal amount of registers is used. Related to instruction scheduling and register allocation. Looks interesting, but I'm unsure of how usefull it will be to us.

The Stratego Series

Program Transformation with Stratego_XT-Rules, Strategies, Tools, and Systems
[href]www.cs.uu.nl/~visser/ftp/Vis03-strategoxt.pdf[/href]

From the Abstract:
Quote: "Stratego/XT is a framework for the development of transformation
systems aiming to support a wide range of program transformations. The framework
consists of the transformation language Stratego and the XT collection of
transformation tools. Stratego is based on the paradigm of rewriting under the
control of programmable rewriting strategies. The XT tools provide facilities
for the infrastructure of transformation systems including parsing and prettyprinting.
The framework addresses the entire range of the development process;
from the specification of transformations to their composition into transformation
systems. This chapter gives an overview of the main ingredients involved in the
composition of transformation systems with Stratego/XT, where we distinguish
the abstraction levels of rules, strategies, tools, and systems."


Strategies for Source-to-Source Constant Propagation
http://citeseer.ist.psu.edu/olmos02strategies.html

Shows how data optimizations can be applied to abstract syntax trees using Stratego.

The Stratego Tutorial and The Stratego Reference Manual
http://citeseer.ist.psu.edu/56794.html
http://citeseer.ist.psu.edu/294659.html

These explain the stratego language. What it does, and how it does it.

Warm Fusion in Stratego - A Case Study in Generation of Program Transformation Systems
http://citeseer.ist.psu.edu/373123.html

Provides an example of Stratego in action on a non-trival problem. This is the last Stratego paper that I'll be mentioning for now. The following papers are unrelated to Stratego.

Catamorphic Approach to Program Analysis
http://citeseer.ist.psu.edu/561292.html

Proposes a new framework for program analyses that regards them as maximum marking problems. Looks interesting and may prove usefull. But it also looks kind of difficult to implement. I'll have to look into it more later.

Iterative-Free Program Analysis
http://citeseer.ist.psu.edu/601409.html

Proposes a new way to analyize a program using recursive-graph traversal instead of the much more common iterative method. Looks interesting, but seems a bit too advanced.

Tupling Calculation Eliminates Multiple Data Traversals
http://citeseer.ist.psu.edu/21786.html

When optimizing a program, it is quite common for you to have to traverse a data structure multiple times. This paper discribes the Tupling Calculation which is "a well known transformation tactic to obtain new efficient recursive functions by grouping some recursive functions into a tuple." It seems to be intended for functional and not imperative languages(C, Basic, etc.) so I don't know if it will be usefull to us or not.

Combined Code Motion and Register Allocation using the Value State Dependence Graph
http://www.cl.cam.ac.uk/~am/papers/cc03.ps.gz

This paper describes a method of combining the code motion and register allocation phases efficiently. This looks interesting however it uses a form of Graph Coloring as a register allocator so I don't know how suitable this will be for x86 development.

Early Control of Register Pressure for Software Pipelined Loops
http://citeseer.ist.psu.edu/615495.html

This focuses on register allocation in loops. It brings up some good points about how register allocation should be favored over instruction scheduling because the penelty for missed cache hits and loads is higher than poorly pipelined code. It uses RISC type machines as its example though so I don't know if this will translate to well over to CISC machines like the x86.

SIRA-Schedule Independent Register Allocation for Software Pipelining
http://citeseer.ist.psu.edu/605752.html

This paper also deals with register allocation in loops. Only this one uses integer programing to solve the problem.

Register Saturation in Data Dependence Graphs
http://citeseer.ist.psu.edu/touati00register.html

Yet another paper on register allocation and instruction scheduling. This one is rather long at about 106 pages though and goes into some detail. It basically describes how register constraints can be taken into account during the scheduling phase using multiple DAGs.

Optimal loop parallelization under register constraints
http://citeseer.ist.psu.edu/eisenbeis96optimal.html

Deals with the same topic as the above paper only it solves it with a linear integer programming solution.

LoRA: a Package for Loop Optimal Register Allocation
http://citeseer.ist.psu.edu/223965.html

Describes a package, LoRA, that implements several algorithms for trading the register pressure against the code size.

Interactive Compilation

Compiler Modifications to Support Interactive Compilation
http://citeseer.ist.psu.edu/471977.html

This is someone's thesis that describes the steps they had to take to create an interactive compiler that would allow the user to view and modify the program as it is compiled. This is a pretty interesting concept and I've been seriously considering implementing it. However, given the complexity of this, it would have to be a project that you undertake from scratch instead of trying to tack on the GUI to an existing compiler. You might be interested in trying something like this if you have the time and ambition. It is 64 pages total.

Graphical User Interface for Compiler Optimizations with Simple-SUIF
http://citeseer.ist.psu.edu/252813.html

Hugh technical report/thesis (138 pages). Discusses a GUI for a compiler. Quite similar to the paper above only it is meant more as a means to study the process of back-end optimization than modifying the result.

The FeedBack Compiler
http://citeseer.ist.psu.edu/17434.html

This is the last paper in my Interactive Compilation series. This paper is much shorter than the previous 2(it is 8 pages long). It focuses more on user feedback to improve certain optimizations.

Value Dependence Graphs- Representation Without Taxation
http://citeseer.ist.psu.edu/345322.html

From the abstract:
Quote: "The value dependence graph(VDG) sparse dataflow-like representation that simplifies program analysis and transformation. It is a functional representation that represents control flow as data flow and makes explicit all machine quantities, such as stores and I/O channels."

Looks interesting. The paper goes on to describe several optimizations and how they can exploit the power of the VDG.

Fusion-Based Register Allocation
http://citeseer.ist.psu.edu/386426.html

From the abstract:
Quote: "The register allocation phase of a compiler maps live ranges of a program to registers. If there are more candidates than there are physical registers, the register allocator must spill a live range (the home location is in memory) or split a live range (the live range occupies multiple locations). One of the challenges for a register allocator is to deal with spilling and splitting together. Fusion based register allocation uses the structure of the program to make splitting and spilling decisions, with the goal to move overhead operations to infrequently executed parts of a program.
The basic idea of fusion based register allocation is to build up the interference graph. Starting with some base region (e.g., a basic block, a loop), the register allocator adds basic blocks to the region and incrementally builds the interference graph. When there are more live ranges than registers, the register allocator selects live ranges to split; these live ranges are split along the edge that was most recently added to the region."


This is a novel approach to register allocation. Though I'm not too sure if it will be suitable for x86 use.

Issues in Register Allocation by Graph Coloring
http://citeseer.ist.psu.edu/105594.html

Describes many of the short-comings of graph coloring and three improvements to overcome them.

Call-Cost Directed Register Allocation
http://citeseer.ist.psu.edu/lueh97callcost.html

Takes the methods described in the paper above and use them to select the right kind of register for a live range. Not too sure that this is usefull.

Integer Linear Programming vs Graph-Based Methods in Code Generation
http://citeseer.ist.psu.edu/5186.html

Discusses the phase ordering problem and the integer programming approach in depth. Offers some approximations of the integer programming approach to speed up compile time.

Linear-Time Register Allocation for a Fixed Number Registers
http://citeseer.ist.psu.edu/2404.html

This paper is somewhat different from all the other register allocation papers. In this paper the goal is not to find the minimum number of registers needed to realize a given program without spilling, but can a program be realized without spilling with at least n registers. This paper claims that it can be solved in linear time as long as the program is well structured(i.e. goto free).

Printing Floating-Point Numbers Quickly and Accurately
http://citeseer.ist.psu.edu/28233.html

This doesn't have to do with compilers, but I thought I'd include it anyway in the off chance you might want to write your own floating-point printing routine.

A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs
http://citeseer.ist.psu.edu/15321.html

Novel register allocation technique using cyclic interval graphs. Might be of some use.

Register Allocation for Indirect Addressing in Loops
http://citeseer.ist.psu.edu/38080.html

Although it doesn't deal with x86 CPUs directly, it does deal with CISC machines. It also mentions several interesting papers that will probably be worth looking into later on. Not too relevant to x86, but usefull nonetheless in helping us find some interesting papers.

Combining Register Allocation and Instruction Scheduling
http://citeseer.ist.psu.edu/115521.html

Yet another method for combining register allocation with instruction scheduling. I'll leave it to you to judge its worth in light of the others.

Hardness and Algorithms for Local Register Allocation
http://citeseer.ist.psu.edu/54499.html

This paper deals with local register allocation, but without any instruction scheduling. It uses a novel integer programming algorithm to solve the problem quickly and efficiently. It is a bit complicated but it looks promising.

Register Allocation sans Coloring
http://citeseer.ist.psu.edu/42139.html

Implements a register allocator that doesn't use graph coloring. Its called fast allocation and is, well, fast. It is also remarkable simple. Definately worth taking a look.

The Power of Belady's Algorithm in Register Allocation for Long Basic Blocks
http://citeseer.ist.psu.edu/647675.html

From the abstract:
Quote: "Optimization techniques such as loop-unrolling and trace-scheduling can result in long straight-line codes. It is, however, unclear how well the register allocation algorithms of current compilers perform on these codes. Compilers may well have been optimized for human written codes, which are likely to have short basic blocks. To evaluate how the state of the art compilers behave on long straight-line codes, we wrote a compiler that implements the simple Belady’s MIN algorithm.
The main contribution of this paper is the evaluation of Belady’s MIN algorithm when used for register allocation for long straight-line codes. These codes were executed on a MIPS R12000 processor. Our results show that applications compiled using Belady’s MIN algorithm run faster than when compiled with the MIPSPro or GCC compiler. In particular, Fast Fourier Transforms (FFTs) of size 32 and 64 run 12% and 33% faster than when compiled using the MIPSPro compiler."


Understanding and Improving Register Assignment
http://citeseer.ist.psu.edu/355529.html

Discusses coordinating register allocation strategies with instruction scheduling.

Evaluation of Algorithms for Local Register Allocation
http://citeseer.ist.psu.edu/665874.html

Four different LRA algorithms are compared for code effeciency and execution time of the algroithm itself. Worth looking into.

Quality and Speed in Linear-Scan Register Allocation
http://citeseer.ist.psu.edu/91122.html

Finally, we get to this paper. After a lot of thought, Linear-Scan register allocation is the algorithm I'm going to use for my compiler, or at least the first implementation of it. It has that combination of speed of execution, simplicity of implementation, and quality of code that I've been looking for for quite a while. This thesis describes it in full. It is definately worth a look as it appears to be the algorithm to go for when writing x86 compilers.

Correction to 'Producing Good Code for the Case Statement'
http://citeseer.ist.psu.edu/kannan94correction.html

Single page document that describes a O(n^2) algorithm for splitting a case-statements jump table into a minimum number of subtables(of a given density). Short and informative.

Code Generation Techniques
http://citeseer.ist.psu.edu/proebsting92code.html

This 154 page thesis introduces an alternative to graph coloring called Probablisitc Register Allocation. It promises to find the optimal solution to the old phase ordering problem in linear(?) time. Also, it is quite simple fitting on a single page. Might prove promising.

Code Generation Techniques for Irregular Architectures
http://citeseer.ist.psu.edu/173331.html

Another paper that deals with the phase-ordering problem. This time it deals with irregular architectures and the need to quickly retarget compilers to new processors.

Code Scheduling and Optimization for a Superscalar x86 Microprocessor
http://citeseer.ist.psu.edu/281885.html(This is the actual page. Just d/l the pdf)

Mandatory reading. This is one of the few papers that I could find that deals with writing a compiler for the x86 architecture. It is a bit old unfortunately, but it is still relevant today. It goes into some depth on how to schedule code, create a good machine description language, etc. Very informative.

Effective Instruction Scheduling With Limited Registers
http://citeseer.ist.psu.edu/627129.html

Phase-ordering problem paper. It is about 134 pages long and was written quite recently(2001).

Experimental evaluation and improvements to linear scan register allocation
http://user.it.uu.se/~happi/publications/regalloc_spe.pdf

Mandatory reading. This one describes improvements to the linear scan register allocator to allow it to run on x86 hardware efficently. Very usefull. If you are thinking of implementing a linear-scan register allocator like I am than this paper is definately a must read.

Improvements to Linear Scan register allocation
[href]www.llvm.org/ProjectsWithLLVM/2004-Fall-CS426-LS.pdf[/href]

I actually found this one when I was making this post. It was released in 2004 April 1 so it is pretty recent. It mentions several of the papers that I've mentioned before and builds upon them. Definately worth looking into if you are planning on making a linear scan register allocator.

Formal Compiler Implementation in a Logical Framework
http://citeseer.ist.psu.edu/571905.html

In case you are having trouble designing and implementing a compiler I found this. It looks like it could be quite usefull.

From the abstract:
Quote: "In this paper, we present a new approach based on higher-order abstract syntax and term rewriting in a logical framework. All program transformations, from parsing to code generation, are cleanly isolated and specified as term rewrites. This has several advantages. The correctness of the compiler depends solely on a small set of rewrite rules that are written in a language of formal mathematics. In addition, the logical framework guarantees the preservation of scoping, and it automates many frequently occuring tasks including substitution and rewriting strategies. As we show, compiler development in a logical framework can be easier than in a general purpose language like ML, in part because of automation, and also because the framework provides extensive support for examination, validation, and debugging of the compiler transformations. This paper is organized around a case study, using the MetaPRL logical framework to compile an ML-like langauge to Intel x86 assembly. We also present a scoped formalization of x86 assembly in which all registers are immutable."


Linear Scan Register Allocation
http://citeseer.ist.psu.edu/poletto99linear.html

Another paper on the usefull linear scan register allocator.

Compilation of Floating Point Arithmetic in the High Performance Erlang Compiler
http://citeseer.ist.psu.edu/547273.html

Although I have a register allocator planned in the form of the linear-scan register allocator algorithm there is a slight problem. It only deals with integers. The pecularity of the stack architecture of the x87 floating-point processor means that some modification of the linear scan register allocator is needed to compile floating-point arithmetic. This paper describes these kind of modifications that were done on an actual x86 compiler for the Erlang language.

Reducing loads and stores in stack architectures
http://citeseer.ist.psu.edu/378471.html

Although not directly related to the x87, these stack optimizations might prove handy when dealing with it.

SMLNJ- Intel x86 back-end Compiler Controlled Memory
http://citeseer.ist.psu.edu/647811.html

From the abstract:
Quote: "This note describes the code generation algorithm used for the Intel x86, introduced in version 110.16. The standard Chaitin graph coloring register allocation cannot be used directly for machines with few registers, as all temporaries wind up being spilled, making for a poor allocation[Cha82]. Thus, for the x86, the conceptual model of the architecture has been extended with a set of memory locations that are treated as registers. The net effect is one where temporaries are computed into memory locations and passed as arguments to functions. The use of these memory locations managed in this way can be as fast as using registers, where the register allocation algorithm is indirectly taking the hardware register renaming mechanism into account."


Some notes on the new MLRISC x86 floating point code generator
http://citeseer.ist.psu.edu/682023.html

Describes an efficient scheme for generating floating-point arithmatic. I believe that this is the scheme that I'll be using for my floating-point register allocator. Mandatory reading.

Misc

They discuss interesting reseach papers on compiler optimizations
http://paramount.www.ecn.purdue.edu/ParaMount/reading.html

Code Specialization Literature Survey
http://www1.cs.columbia.edu/~locasto/projects/codespec/codespecialization.survey.html

Nifty page filled with examples of low-level optimizations for Pentiums. Must read.
http://www.goof.com/pcg/doc/pentium-optxref.html

Pdf link to a book about assemblers and loaders conviently titled Assemblers and Loaders
http://www.ecs.csun.edu/~dsalomon/assem.advertis/asl.pdf
Page that contains some info about said book can be found here
http://www.ecs.csun.edu/~dsalomon/assem.advertis/AssemAd.html

The PowerPC Compiler Writer's Guide
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF7785256996007558C6

Not x86 related, but in case you every feel the urge to write a compiler for a Mac.
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 28th Jan 2005 10:29
Awesome Neophyte, great to see even more links!

I hope to catch up with my programming this weekend, so you've given me a good starting point.



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
Neophyte
22
Years of Service
User Offline
Joined: 23rd Feb 2003
Location: United States
Posted: 30th Jan 2005 15:11
@MikeS

Glad I could help.

@TKF15H

Okay here are the binary encodings for the JMP and CALL instructions. You can find them in Intel's Software Developer's Manual Vol. 2B.

JMP

JMP instructions use the "near" format in modern 32-bit processors.

direct - 1110 1001 [Next 4 bytes are pointer to jump to]

"Direct" is the most common encoding you'll find.

short - 1110 1011 [8 bit displacement]

"Short" is rarely seen, but it is good practice to use short whenever possible since it is 3 bytes smaller than "Direct". Short operands allow for more instructions to be decoded(and therefore executed) at a time.

CALL

CALL instructions use the "In the Same Segement" format in modern 32-bit processors.

direct - 1110 1000 [Next 4 bytes are pointer to jump to]

"Direct is the most common encoding you'll find and probably the only one you'll use.

Post if you need any more info or anything in the Intel volumes explained.
ionstream
20
Years of Service
User Offline
Joined: 4th Jul 2004
Location: Overweb
Posted: 31st Jan 2005 08:51
Wow, that manual is awesome! Now I can understand programs in binary! (not really).

I was always wondering how people made ASM assemblers, and now I know!

THXZ!!

TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 1st Feb 2005 06:59
Thanks Neo!
I'll have to experiment with this for a while. Since WB2 is a JITer, I'm going to have to link stuff on run-time. Weee! Hours of fun!

AphoticVM status: 80% AphoticBasic status: 10%
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 6th Mar 2005 17:16
This is a serious post, not a bump.

As many will know im writing a scripting engine. DavidT advised me to use RPN for my expression parsing. However that seems a very cumbersome method.

Which method of parsing would you recommend i look at?

Regards,
Rich

PowerScript: Currently Working on Expression Parsing, 10% complete (Yep the number went down)
David T
Retired Moderator
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: England
Posted: 6th Mar 2005 17:45 Edited at: 6th Mar 2005 17:50
Cumbersome? How so? I've always found it quite easy and simple - how ever I can imagine that with DB it'll be quite hard to code.

I suppose it works best if you're exporting to some form of intermediate language because then you don't need to do the maths yourself, just generate the icode.

[edit] Actually that's a point - have you thought about converting your scripts to an icode format first, then just running these from your main script prog? It'd be much faster and mean you only do a 'compile' (ie read through source in depth) once.

source.txt

|
|

[compiler]

|
|

icode.dat

|
|

DBProg.exe runs the icode instructions

Get 15 new commands, all the date / time commands left out of DBPro for free!
DOWNLOAD PLUGINS HERE: http://www.davidtattersall.me.uk/ and select "DarkBasic"
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 7th Mar 2005 00:27
how does "ICODE" work?

PowerScript: Currently Working on Expression Parsing, 10% complete (Yep the number went down)
David T
Retired Moderator
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: England
Posted: 7th Mar 2005 00:46
Well, convert your high level language into a simpler set of instructions. Then the dbpro program that actually reads the code and performs the actions can just read in some simple instructions.

For example if you had the line

percent = (45 / 50) * 100;

You could translate that to some simpler instructions like



That's a lot simpler and quicker to parse than the original code. This way, the intensive business of checking syntax and converting to icode (intermediate code) is only done once.

Get 15 new commands, all the date / time commands left out of DBPro for free!
DOWNLOAD PLUGINS HERE: http://www.davidtattersall.me.uk/ and select "DarkBasic"
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 14th Mar 2005 00:23
Just a little update:

Had some spare time yesterday, so I started working on an interpreter for DBP.

Here's an example script vs. the DBP equivelent.



On my interpreter, the speed loss is quite noticably for this example, but it's not too horrible. If I can optimize it to within 10 or 20 frames lost, rather than 50 or 60, I'd be quite happy.



A book? I hate book. Book is stupid.
(Formerly known as Yellow)
PowerSoft
20
Years of Service
User Offline
Joined: 10th Oct 2004
Location: United Kingdom
Posted: 17th Mar 2005 02:02
*back from france*

Hey that sounds good Mike

PowerScript: Currently Working on Expression Parsing, 10% complete (Yep the number went down)
MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 8th Jun 2005 02:50
Was researching operating systems today, and came upon another compiler link.

http://www.personal.kent.edu/~rmuhamma/Compilers/compiler.html

It's got a lot of source, and most of it's in C.



A book? I hate book. Book is stupid.
(Formerly Yellow)
TKF15H
21
Years of Service
User Offline
Joined: 20th Jul 2003
Location: Rio de Janeiro
Posted: 8th Jun 2005 09:12
Oh goody, finally some C code. The other dude's tutorial was Pascal. ARG.

MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 16th Jun 2005 13:04
Yup, always good to see some actual implementation. For the most part, it's a pretty small compiler too in terms of length of source.

Anyway, just bumping this up. How's your progress going on your compilers TKF15H, Neophyte, RichS, or anyone?

Summer break just started last week, so I went ahead and opened up the Dragon Book(Compilers, Principles, Techniques, and Tools.) I'm setting a goal to have a working compiler by the end of Summer(11 weeks ). Feels good to actually have some free time.



A book? I hate book. Book is stupid.
(Formerly Yellow)
Kevin Picone
22
Years of Service
User Offline
Joined: 27th Aug 2002
Location: Australia
Posted: 16th Jun 2005 15:57 Edited at: 11th Aug 2010 22:08
Jezz there's a lot of stuff in here, oh well too late now

Anyway, www.PlayBasic.com is going quite well, were about to get back to another compiler update (after the collision update is complete) since it's hasn't really be updated for around year. oh how time flies.

The update will most likely implement a few features that had been pushed down the priority list in favor of engine/command set stuff.. Mostly data structures / type related. There's a fair bit of work in what we've in mind.

Another addition will be the including support for pre-compiled code modules. We've already started to lay out the foundations of this, but nothings made it's way into PB release as yet. Ultimately this will allow us and users also to develop PlayBASIC command sets in native PlayBASIC, which they can supply as precompiled (closed source) modules. Aka, as object files for those assembler throws backs like myself.. ..

but anyway, no guessing what for happens next..

MikeS
Retired Moderator
22
Years of Service
User Offline
Joined: 2nd Dec 2002
Location: United States
Posted: 17th Jun 2005 07:58
Quote: "users also to develop PB command sets in native PB"


Sounds great Kevin, this would be an excellent feature to have.

Just downloaded PlayBasic for the first time a few weeks ago, and was very impressed with the language as a whole. Might have to start saving some $$ and buy the full version now.



A book? I hate book. Book is stupid.
(Formerly Yellow)

Login to post a reply

Server time is: 2025-06-04 02:21:02
Your offset time is: 2025-06-04 02:21:02