Run-time-omgivningar
Kursen Kompilatorer och interpretatorer |
Föreläsningar:
1
2
3
4
5
6
7
8
9
10
11
12
Det här är ungefär vad jag tänker säga på föreläsningen.
Använd det för förberedelser, repetition och ledning.
Det är inte en definition av kursinnehållet,
och det ersätter inte kursboken.
Idag: Run-time-omgivningar
7 Run-Time Environments
What happens when the program is executed.
Especially how names in the source program,
like a variable called x,
is translated to memory locations.
Några grundläggande begrepp
ALSU-07, bl a 1.6 Programming Language Basics och delar av kapitel 7
(ASU-86: 7.1 Source Language Issues)
- Procedure (including functions)
- Procedure definition
- Procedure name
- Procedure body
- Procedure call (Swedish: proceduranrop)
- Formal parameters = formal arguments = formals:
x and y in int f(int x, int y) { ... }
- Actual parameters = actual arguments = actuals
3 + 2 and n in f(3 + 2, n)
- Activation
- Activation record (Swedish: aktiveringspost) = frame
- Control stack
- Stack pointer, stack top (often kept in a register)
- Lifetime
- Recursive
- Control tree
- Scope (Swedish: räckvidd?)
- Local / non-local
- Data object
- Environment (Swedish: omgivning)
- Binding (of a name to a storage location)
7.1 Storage organization
ALSU-07 fig. 7.1 (ASU-86 fig. 7.7):
Typical subdivision of run-time memory into code and data areas.
The stack is cheaper than the heap.
Nyare: Stacken växer nerifrån, dvs från höga adresser mot lägre.
Växer stacken uppåt eller neråt på din dator? Testa med programmet
adresser.c:
#include <stdlib.h>
#include <stdio.h>
#include <limits.h>
int g;
void f(int n) {
int x;
printf("x i f(%d) finns på adressen %p = %lu\n",
n, (void*)&x, (long unsigned)&x);
if (n < 3)
f(n + 1);
}
int main(void) {
int x;
void* p;
printf("En void-pekare är %d bitar.\n",
sizeof(void*) * CHAR_BIT);
printf("x i main finns på adressen %p = %lu\n",
(void*)&x, (long unsigned)&x);
f(1);
printf("Den globala variabeln g finns på adressen %p = %lu\n",
(void*)&g, (long unsigned)&g);
p = malloc(100);
printf("Den malloc-allokerade arean p finns på adressen %p = %lu\n",
(void*)p, (long unsigned)p);
printf("Funktionen f finns på adressen %p = %lu\n",
(void*)f, (long unsigned)f);
return 0;
}
Resultat på en dator (en PC med Intel x86-processor):
En void-pekare är 32 bitar.
x i main finns på adressen 0xbf9d1c14 = 3214744596
x i f(1) finns på adressen 0xbf9d1bf4 = 3214744564
x i f(2) finns på adressen 0xbf9d1bd4 = 3214744532
x i f(3) finns på adressen 0xbf9d1bb4 = 3214744500
Den globala variabeln g finns på adressen 0x8049838 = 134518840
Den malloc-allokerade arean p finns på adressen 0x804a008 = 134520840
Funktionen f finns på adressen 0x80483c4 = 134513604
Resultat på en annan dator (en Sun med Sparc-processor):
En void-pekare är 32 bitar.
x i main finns på adressen ffbff964 = 4290771300
x i f(1) finns på adressen ffbff8ec = 4290771180
x i f(2) finns på adressen ffbff874 = 4290771060
x i f(3) finns på adressen ffbff7fc = 4290770940
Den globala variabeln g finns på adressen 20b4c = 133964
Den malloc-allokerade arean p finns på adressen 20b60 = 133984
Funktionen f finns på adressen 106cc = 67276
Activation records
ALSU-07 fig. 7.5
(ASU-86 fig. 7.8):
A general activation record.
Or, use registers for actual parameters and for the returned value.
Compile-time layout of local data
All this is determined at compile-time.
Each variable, such as the local variable i,
has a fixed offset in the activation record.
Fast to access, if a stack top pointer is kept in a register.
The compiler generates code to access i as (e. g.) SP - 24.
Tre sätt att allokera data
(= bestämma var variablerna ska finnas i minnet):
- Static allocation
- Stack allocation (a form of dynamic allocation)
- Heap allocation (a form of dynamic allocation)
Static allocation
The compiler can decide the size and address of each data object.
If activation records are static => recursion difficult
(i. e., manage your own stack in a static data area).
Example: FORTRAN.
(But don't learn the FORTRAN details in this chapter for the exam!)
7.2 Stack Allocation
ALSU-07 fig. 7.6 (ASU-86 fig. 7.13):
Downwards-growing stack allocation of activation records
Call sequence (Sw: anropskonventioner), return sequence
What should the caller (= the callING procedure) do,
and what should the callee (=the callED procedure) do?
E. g., allocated the activation record, save registers.
Various call sequences exist.
In general, let the callED do as much as possible.
(Just code generated for that once, instead of for each call to it!)
ALSU-07 fig. 7.7
(ASU-86 fig. 7.14):
Division of tasks between caller and callee.
Example:
- The callER evaluates actuals. (And stores them?)
- The callER stores a return address and the old SP in the new activation record,
and increments SP.
- The callED procedure saves registers and other status information.
- The callED procedure initializes local data + begins to execute.
The other way around when returning.
Variable-length data
Can be stored at the end of the activation record,
with just pointers to it among the local variables.
Dangling references
Possible with malloc + bad free,
but also with stack allocation.
Example:
char* getstr() {
char s[100];
fgets(s, sizeof s, stdin);
return s;
}
Heap allocation
malloc + free (or new + delete)
But also: activation records, if they have to be retained!
7.3 Access to non-local data on the stack
- Lexical scope = static scope
- Dynamic scope
1.6.6 Parameter passing
Can be skipped, but you may want to learn the difference between:
- Call by value
- Call by reference
- Copy-restore
- Call by name
- Macro expansion
(ASU-86 7.6) Symboltabeller
Very simple, from ASU-86 chapter 2:
struct entry {
char *lexptr;
int token;
};
#define SYMMAX 100
struct entry symtable[SYMMAX];
int lastentry = 0;
Additionally:
- Type information
- Size (in memory)
- Location (= where in memory) (unless we are generating assembly language)
- Several entries with the same lexeme!
(Example: many procedures can have a local variable named i.)
= scope!
"Built during analysis, used during synthesis,"
We make a difference between:
- identifer (such as x)
- name (such as the name x of this variable, and the name x of that procedure)
Operations to handle scope in a block-structured language:
- begin_scope()
- find_name(x) - in the current block, and then the next outside, etc
- add_name(x)
- end_scope()
We can use a stack! (Push each new scope.)
But hash tables are much more efficient.
7.4 Heap management
Om hur minnet på heapen ("den allmänna, dynamiska allolkeringsplatsen i processens adressrymd")
hanteras, med allokering och avallokering.
Datorns minneshierarki.
Lokalitet.
Fragmentering.
7.5-7.8 Skräpsamling
Läs i mån av intresse. Kursinnehållet finns i
The Very Basics of Garbage Collection.
Skräpsamling (eng: garbage collection)
The Very Basics of Garbage Collection
Några viktiga begrepp:
- Dynamically allocated memory (C: malloc, C++: new)
- Manual memory management (C: free, C++: delete)
- Garbage
- Why is manual memory management hard?
- Automatic garbage collection
- Live data / dead data
- Reachable data / unreachable data
- Root set
- Mark and sweep: mark bit, marking phase, sweep phase
- Reference counting
- Generational collection
- Incremental collection
- Conservative collection.
For C and C++:
Boehm
(http://www.hpl.hp.com/personal/Hans_Boehm/gc/)
Old program (grows and crashes):
#include <stdlib.h>
int main() {
while (1)
malloc(100);
return 0;
}
New program (works Ok):
#include <stdlib.h>
#include "gc.h"
int main() {
while (1)
GC_malloc(100);
return 0;
}
C++: Several ways. One is to inherit from class gc:
class A: public gc {...};
A* a = new A; // a is collectable.
Kursen Kompilatorer och interpretatorer |
Föreläsningar:
1
2
3
4
5
6
7
8
9
10
11
12
Thomas Padron-McCarthy
(Thomas.Padron-McCarthy@oru.se)
20 september 2007