Update to w3m-0.2.1-inu-1.6.
This commit is contained in:
		
							
								
								
									
										617
									
								
								gc/doc/README
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
										617
									
								
								gc/doc/README
									
									
									
									
									
										Normal file
									
								
							| @@ -0,0 +1,617 @@ | ||||
| Copyright (c) 1988, 1989 Hans-J. Boehm, Alan J. Demers | ||||
| Copyright (c) 1991-1996 by Xerox Corporation.  All rights reserved. | ||||
| Copyright (c) 1996-1999 by Silicon Graphics.  All rights reserved. | ||||
| Copyright (c) 1999-2001 by Hewlett-Packard Company. All rights reserved. | ||||
|  | ||||
| The file linux_threads.c is also | ||||
| Copyright (c) 1998 by Fergus Henderson.  All rights reserved. | ||||
|  | ||||
| The files Makefile.am, and configure.in are | ||||
| Copyright (c) 2001 by Red Hat Inc. All rights reserved. | ||||
|  | ||||
| The files config.guess and a few others are copyrighted by the Free | ||||
| Software Foundation. | ||||
|  | ||||
| THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED | ||||
| OR IMPLIED.  ANY USE IS AT YOUR OWN RISK. | ||||
|  | ||||
| Permission is hereby granted to use or copy this program | ||||
| for any purpose,  provided the above notices are retained on all copies. | ||||
| Permission to modify the code and to distribute modified code is granted, | ||||
| provided the above notices are retained, and a notice that the code was | ||||
| modified is included with the above copyright notice. | ||||
|  | ||||
| A few of the files needed to use the GNU-style build procedure come with | ||||
| slightly different licenses, though they are all similar in spirit.  A few | ||||
| are GPL'ed, but with an exception that should cover all uses in the | ||||
| collector.  (If you are concerned about such things, I recommend you look | ||||
| at the notice in config.guess or ltmain.sh.) | ||||
|  | ||||
| This is version 6.0 of a conservative garbage collector for C and C++. | ||||
|  | ||||
| You might find a more recent version of this at | ||||
|  | ||||
| http://www.hpl.hp.com/personal/Hans_Boehm/gc | ||||
|  | ||||
| OVERVIEW | ||||
|  | ||||
|     This is intended to be a general purpose, garbage collecting storage | ||||
| allocator.  The algorithms used are described in: | ||||
|  | ||||
| Boehm, H., and M. Weiser, "Garbage Collection in an Uncooperative Environment", | ||||
| Software Practice & Experience, September 1988, pp. 807-820. | ||||
|  | ||||
| Boehm, H., A. Demers, and S. Shenker, "Mostly Parallel Garbage Collection", | ||||
| Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design | ||||
| and Implementation, SIGPLAN Notices 26, 6 (June 1991), pp. 157-164. | ||||
|  | ||||
| Boehm, H., "Space Efficient Conservative Garbage Collection", Proceedings | ||||
| of the ACM SIGPLAN '91 Conference on Programming Language Design and | ||||
| Implementation, SIGPLAN Notices 28, 6 (June 1993), pp. 197-206. | ||||
|  | ||||
| Boehm H., "Reducing Garbage Collector Cache Misses", Proceedings of the | ||||
| 2000 International Symposium on Memory Management. | ||||
|  | ||||
|   Possible interactions between the collector and optimizing compilers are | ||||
| discussed in | ||||
|  | ||||
| Boehm, H., and D. Chase, "A Proposal for GC-safe C Compilation", | ||||
| The Journal of C Language Translation 4, 2 (December 1992). | ||||
|  | ||||
| and | ||||
|  | ||||
| Boehm H., "Simple GC-safe Compilation", Proceedings | ||||
| of the ACM SIGPLAN '96 Conference on Programming Language Design and | ||||
| Implementation. | ||||
|  | ||||
| (Some of these are also available from | ||||
| http://www.hpl.hp.com/personal/Hans_Boehm/papers/, among other places.) | ||||
|  | ||||
|   Unlike the collector described in the second reference, this collector | ||||
| operates either with the mutator stopped during the entire collection | ||||
| (default) or incrementally during allocations.  (The latter is supported | ||||
| on only a few machines.)  On the most common platforms, it can be built | ||||
| with or without thread support.  On a few platforms, it can take advantage | ||||
| of a multiprocessor to speed up garbage collection. | ||||
|  | ||||
|   Many of the ideas underlying the collector have previously been explored | ||||
| by others.  Notably, some of the run-time systems developed at Xerox PARC | ||||
| in the early 1980s conservatively scanned thread stacks to locate possible | ||||
| pointers (cf. Paul Rovner, "On Adding Garbage Collection and Runtime Types | ||||
| to a Strongly-Typed Statically Checked, Concurrent Language"  Xerox PARC | ||||
| CSL 84-7).  Doug McIlroy wrote a simpler fully conservative collector that | ||||
| was part of version 8 UNIX (tm), but appears to not have received | ||||
| widespread use. | ||||
|  | ||||
|   Rudimentary tools for use of the collector as a leak detector are included | ||||
| (see http://www.hpl.hp.com/personal/Hans_Boehm/gc/leak.html), | ||||
| as is a fairly sophisticated string package "cord" that makes use of the | ||||
| collector.  (See doc/README.cords and H.-J. Boehm, R. Atkinson, and M. Plass, | ||||
| "Ropes: An Alternative to Strings", Software Practice and Experience 25, 12 | ||||
| (December 1995), pp. 1315-1330.  This is very similar to the "rope" package | ||||
| in Xerox Cedar, or the "rope" package in the SGI STL or the g++ distribution.) | ||||
|  | ||||
| Further collector documantation can be found at | ||||
|  | ||||
| http://www.hpl.hp.com/personal/Hans_Boehm/gc | ||||
|  | ||||
|  | ||||
| GENERAL DESCRIPTION | ||||
|  | ||||
|   This is a garbage collecting storage allocator that is intended to be | ||||
| used as a plug-in replacement for C's malloc. | ||||
|  | ||||
|   Since the collector does not require pointers to be tagged, it does not | ||||
| attempt to ensure that all inaccessible storage is reclaimed.  However, | ||||
| in our experience, it is typically more successful at reclaiming unused | ||||
| memory than most C programs using explicit deallocation.  Unlike manually | ||||
| introduced leaks, the amount of unreclaimed memory typically stays | ||||
| bounded. | ||||
|  | ||||
|   In the following, an "object" is defined to be a region of memory allocated | ||||
| by the routines described below.   | ||||
|  | ||||
|   Any objects not intended to be collected must be pointed to either | ||||
| from other such accessible objects, or from the registers, | ||||
| stack, data, or statically allocated bss segments.  Pointers from | ||||
| the stack or registers may point to anywhere inside an object. | ||||
| The same is true for heap pointers if the collector is compiled with | ||||
|  ALL_INTERIOR_POINTERS defined, as is now the default. | ||||
|  | ||||
| Compiling without ALL_INTERIOR_POINTERS may reduce accidental retention | ||||
| of garbage objects, by requiring pointers from the heap to to the beginning | ||||
| of an object.  But this no longer appears to be a significant | ||||
| issue for most programs. | ||||
|  | ||||
| There are a number of routines which modify the pointer recognition | ||||
| algorithm.  GC_register_displacement allows certain interior pointers | ||||
| to be recognized even if ALL_INTERIOR_POINTERS is nor defined. | ||||
| GC_malloc_ignore_off_page allows some pointers into the middle of large objects | ||||
| to be disregarded, greatly reducing the probablility of accidental | ||||
| retention of large objects.  For most purposes it seems best to compile | ||||
| with ALL_INTERIOR_POINTERS and to use GC_malloc_ignore_off_page if | ||||
| you get collector warnings from allocations of very large objects. | ||||
| See README.debugging for details. | ||||
|  | ||||
|   WARNING: pointers inside memory allocated by the standard "malloc" are not | ||||
| seen by the garbage collector.  Thus objects pointed to only from such a | ||||
| region may be prematurely deallocated.  It is thus suggested that the | ||||
| standard "malloc" be used only for memory regions, such as I/O buffers, that | ||||
| are guaranteed not to contain pointers to garbage collectable memory. | ||||
| Pointers in C language automatic, static, or register variables, | ||||
| are correctly recognized.  (Note that GC_malloc_uncollectable has semantics | ||||
| similar to standard malloc, but allocates objects that are traced by the | ||||
| collector.) | ||||
|  | ||||
|   WARNING: the collector does not always know how to find pointers in data | ||||
| areas that are associated with dynamic libraries.  This is easy to | ||||
| remedy IF you know how to find those data areas on your operating | ||||
| system (see GC_add_roots).  Code for doing this under SunOS, IRIX 5.X and 6.X, | ||||
| HP/UX, Alpha OSF/1, Linux, and win32 is included and used by default.  (See | ||||
| README.win32 for win32 details.)  On other systems pointers from dynamic | ||||
| library data areas may not be considered by the collector. | ||||
| If you're writing a program that depends on the collector scanning | ||||
| dynamic library data areas, it may be a good idea to include at least | ||||
| one call to GC_is_visible() to ensure that those areas are visible | ||||
| to the collector. | ||||
|  | ||||
|   Note that the garbage collector does not need to be informed of shared | ||||
| read-only data.  However if the shared library mechanism can introduce | ||||
| discontiguous data areas that may contain pointers, then the collector does | ||||
| need to be informed. | ||||
|  | ||||
|   Signal processing for most signals may be deferred during collection, | ||||
| and during uninterruptible parts of the allocation process. | ||||
| Like standard ANSI C mallocs, by default it is unsafe to invoke | ||||
| malloc (and other GC routines) from a signal handler while another | ||||
| malloc call may be in progress. Removing -DNO_SIGNALS from Makefile | ||||
| attempts to remedy that.  But that may not be reliable with a compiler that | ||||
| substantially reorders memory operations inside GC_malloc. | ||||
|  | ||||
|   The allocator/collector can also be configured for thread-safe operation. | ||||
| (Full signal safety can also be achieved, but only at the cost of two system | ||||
| calls per malloc, which is usually unacceptable.) | ||||
| WARNING: the collector does not guarantee to scan thread-local storage | ||||
| (e.g. of the kind accessed with pthread_getspecific()).  The collector | ||||
| does scan thread stacks, though, so generally the best solution is to | ||||
| ensure that any pointers stored in thread-local storage are also | ||||
| stored on the thread's stack for the duration of their lifetime. | ||||
| (This is arguably a longstanding bug, but it hasn't been fixed yet.) | ||||
|  | ||||
| INSTALLATION AND PORTABILITY | ||||
|  | ||||
|   As distributed, the macro SILENT is defined in Makefile. | ||||
| In the event of problems, this can be removed to obtain a moderate | ||||
| amount of descriptive output for each collection. | ||||
| (The given statistics exhibit a few peculiarities. | ||||
| Things don't appear to add up for a variety of reasons, most notably | ||||
| fragmentation losses.  These are probably much more significant for the | ||||
| contrived program "test.c" than for your application.) | ||||
|  | ||||
|   Note that typing "make test" will automatically build the collector | ||||
| and then run setjmp_test and gctest. Setjmp_test will give you information | ||||
| about configuring the collector, which is useful primarily if you have | ||||
| a machine that's not already supported.  Gctest is a somewhat superficial | ||||
| test of collector functionality.  Failure is indicated by a core dump or | ||||
| a message to the effect that the collector is broken.  Gctest takes about  | ||||
| 35 seconds to run on a SPARCstation 2. It may use up to 8 MB of memory.  (The | ||||
| multi-threaded version will use more.  64-bit versions may use more.) | ||||
| "Make test" will also, as its last step, attempt to build and test the | ||||
| "cord" string library.  This will fail without an ANSI C compiler, but | ||||
| the garbage collector itself should still be usable. | ||||
|  | ||||
|   The Makefile will generate a library gc.a which you should link against. | ||||
| Typing "make cords" will add the cord library to gc.a. | ||||
| Note that this requires an ANSI C compiler. | ||||
|  | ||||
|   It is suggested that if you need to replace a piece of the collector | ||||
| (e.g. GC_mark_rts.c) you simply list your version ahead of gc.a on the | ||||
| ld command line, rather than replacing the one in gc.a.  (This will | ||||
| generate numerous warnings under some versions of AIX, but it still | ||||
| works.) | ||||
|  | ||||
|   All include files that need to be used by clients will be put in the | ||||
| include subdirectory.  (Normally this is just gc.h.  "Make cords" adds | ||||
| "cord.h" and "ec.h".) | ||||
|  | ||||
|   The collector currently is designed to run essentially unmodified on | ||||
| machines that use a flat 32-bit or 64-bit address space. | ||||
| That includes the vast majority of Workstations and X86 (X >= 3) PCs. | ||||
| (The list here was deleted because it was getting too long and constantly | ||||
| out of date.) | ||||
|   It does NOT run under plain 16-bit DOS or Windows 3.X.  There are however | ||||
| various packages (e.g. win32s, djgpp) that allow flat 32-bit address | ||||
| applications to run under those systemsif the have at least an 80386 processor, | ||||
| and several of those are compatible with the collector. | ||||
|  | ||||
|   In a few cases (Amiga, OS/2, Win32, MacOS) a separate makefile | ||||
| or equivalent is supplied.  Many of these have separate README.system | ||||
| files. | ||||
|  | ||||
|   Dynamic libraries are completely supported only under SunOS | ||||
| (and even that support is not functional on the last Sun 3 release), | ||||
| Linux, IRIX 5&6, HP-PA, Win32 (not Win32S) and OSF/1 on DEC AXP machines. | ||||
| On other machines we recommend that you do one of the following: | ||||
|  | ||||
|   1) Add dynamic library support (and send us the code). | ||||
|   2) Use static versions of the libraries. | ||||
|   3) Arrange for dynamic libraries to use the standard malloc. | ||||
|      This is still dangerous if the library stores a pointer to a | ||||
|      garbage collected object.  But nearly all standard interfaces | ||||
|      prohibit this, because they deal correctly with pointers | ||||
|      to stack allocated objects.  (Strtok is an exception.  Don't | ||||
|      use it.) | ||||
|  | ||||
|   In all cases we assume that pointer alignment is consistent with that | ||||
| enforced by the standard C compilers.  If you use a nonstandard compiler | ||||
| you may have to adjust the alignment parameters defined in gc_priv.h. | ||||
|  | ||||
|   A port to a machine that is not byte addressed, or does not use 32 bit | ||||
| or 64 bit addresses will require a major effort.  A port to plain MSDOS | ||||
| or win16 is hard. | ||||
|  | ||||
|   For machines not already mentioned, or for nonstandard compilers, the | ||||
| following are likely to require change: | ||||
|  | ||||
| 1.  The parameters in gcconfig.h. | ||||
|       The parameters that will usually require adjustment are | ||||
|    STACKBOTTOM,  ALIGNMENT and DATASTART.  Setjmp_test | ||||
|    prints its guesses of the first two. | ||||
|       DATASTART should be an expression for computing the | ||||
|    address of the beginning of the data segment.  This can often be | ||||
|    &etext.  But some memory management units require that there be | ||||
|    some unmapped space between the text and the data segment.  Thus | ||||
|    it may be more complicated.   On UNIX systems, this is rarely | ||||
|    documented.  But the adb "$m" command may be helpful.  (Note | ||||
|    that DATASTART will usually be a function of &etext.  Thus a | ||||
|    single experiment is usually insufficient.) | ||||
|      STACKBOTTOM is used to initialize GC_stackbottom, which | ||||
|    should be a sufficient approximation to the coldest stack address. | ||||
|    On some machines, it is difficult to obtain such a value that is | ||||
|    valid across a variety of MMUs, OS releases, etc.  A number of | ||||
|    alternatives exist for using the collector in spite of this.  See the | ||||
|    discussion in gcconfig.h immediately preceding the various | ||||
|    definitions of STACKBOTTOM. | ||||
|     | ||||
| 2.  mach_dep.c. | ||||
|       The most important routine here is one to mark from registers. | ||||
|     The distributed file includes a generic hack (based on setjmp) that | ||||
|     happens to work on many machines, and may work on yours.  Try | ||||
|     compiling and running setjmp_t.c to see whether it has a chance of | ||||
|     working.  (This is not correct C, so don't blame your compiler if it | ||||
|     doesn't work.  Based on limited experience, register window machines | ||||
|     are likely to cause trouble.  If your version of setjmp claims that | ||||
|     all accessible variables, including registers, have the value they | ||||
|     had at the time of the longjmp, it also will not work.  Vanilla 4.2 BSD | ||||
|     on Vaxen makes such a claim.  SunOS does not.) | ||||
|       If your compiler does not allow in-line assembly code, or if you prefer | ||||
|     not to use such a facility, mach_dep.c may be replaced by a .s file | ||||
|     (as we did for the MIPS machine and the PC/RT). | ||||
|       At this point enough architectures are supported by mach_dep.c | ||||
|     that you will rarely need to do more than adjust for assembler | ||||
|     syntax. | ||||
|  | ||||
| 3.  os_dep.c (and gc_priv.h). | ||||
|   	  Several kinds of operating system dependent routines reside here. | ||||
|   	Many are optional.  Several are invoked only through corresponding | ||||
|   	macros in gc_priv.h, which may also be redefined as appropriate. | ||||
|       The routine GC_register_data_segments is crucial.  It registers static | ||||
|     data areas that must be traversed by the collector. (User calls to | ||||
|     GC_add_roots may sometimes be used for similar effect.) | ||||
|       Routines to obtain memory from the OS also reside here. | ||||
|     Alternatively this can be done entirely by the macro GET_MEM | ||||
|     defined in gc_priv.h.  Routines to disable and reenable signals | ||||
|     also reside here if they are need by the macros DISABLE_SIGNALS | ||||
|     and ENABLE_SIGNALS defined in gc_priv.h. | ||||
|       In a multithreaded environment, the macros LOCK and UNLOCK | ||||
|     in gc_priv.h will need to be suitably redefined. | ||||
|       The incremental collector requires page dirty information, which | ||||
|     is acquired through routines defined in os_dep.c.  Unless directed | ||||
|     otherwise by gcconfig.h, these are implemented as stubs that simply | ||||
|     treat all pages as dirty.  (This of course makes the incremental | ||||
|     collector much less useful.) | ||||
|  | ||||
| 4.  dyn_load.c | ||||
| 	This provides a routine that allows the collector to scan data | ||||
| 	segments associated with dynamic libraries.  Often it is not | ||||
| 	necessary to provide this routine unless user-written dynamic | ||||
| 	libraries are used. | ||||
|  | ||||
|   For a different version of UN*X or different machines using the | ||||
| Motorola 68000, Vax, SPARC, 80386, NS 32000, PC/RT, or MIPS architecture, | ||||
| it should frequently suffice to change definitions in gcconfig.h. | ||||
|  | ||||
|  | ||||
| THE C INTERFACE TO THE ALLOCATOR | ||||
|  | ||||
|   The following routines are intended to be directly called by the user. | ||||
| Note that usually only GC_malloc is necessary.  GC_clear_roots and GC_add_roots | ||||
| calls may be required if the collector has to trace from nonstandard places | ||||
| (e.g. from dynamic library data areas on a machine on which the  | ||||
| collector doesn't already understand them.)  On some machines, it may | ||||
| be desirable to set GC_stacktop to a good approximation of the stack base.  | ||||
| (This enhances code portability on HP PA machines, since there is no | ||||
| good way for the collector to compute this value.)  Client code may include | ||||
| "gc.h", which defines all of the following, plus many others. | ||||
|  | ||||
| 1)  GC_malloc(nbytes) | ||||
|     - allocate an object of size nbytes.  Unlike malloc, the object is | ||||
|       cleared before being returned to the user.  Gc_malloc will | ||||
|       invoke the garbage collector when it determines this to be appropriate. | ||||
|       GC_malloc may return 0 if it is unable to acquire sufficient | ||||
|       space from the operating system.  This is the most probable | ||||
|       consequence of running out of space.  Other possible consequences | ||||
|       are that a function call will fail due to lack of stack space, | ||||
|       or that the collector will fail in other ways because it cannot | ||||
|       maintain its internal data structures, or that a crucial system | ||||
|       process will fail and take down the machine.  Most of these | ||||
|       possibilities are independent of the malloc implementation. | ||||
|  | ||||
| 2)  GC_malloc_atomic(nbytes) | ||||
|     - allocate an object of size nbytes that is guaranteed not to contain any | ||||
|       pointers.  The returned object is not guaranteed to be cleared. | ||||
|       (Can always be replaced by GC_malloc, but results in faster collection | ||||
|       times.  The collector will probably run faster if large character | ||||
|       arrays, etc. are allocated with GC_malloc_atomic than if they are | ||||
|       statically allocated.) | ||||
|  | ||||
| 3)  GC_realloc(object, new_size) | ||||
|     - change the size of object to be new_size.  Returns a pointer to the | ||||
|       new object, which may, or may not, be the same as the pointer to | ||||
|       the old object.  The new object is taken to be atomic iff the old one | ||||
|       was.  If the new object is composite and larger than the original object, | ||||
|       then the newly added bytes are cleared (we hope).  This is very likely | ||||
|       to allocate a new object, unless MERGE_SIZES is defined in gc_priv.h. | ||||
|       Even then, it is likely to recycle the old object only if the object | ||||
|       is grown in small additive increments (which, we claim, is generally bad | ||||
|       coding practice.) | ||||
|  | ||||
| 4)  GC_free(object) | ||||
|     - explicitly deallocate an object returned by GC_malloc or | ||||
|       GC_malloc_atomic.  Not necessary, but can be used to minimize | ||||
|       collections if performance is critical.  Probably a performance | ||||
|       loss for very small objects (<= 8 bytes). | ||||
|  | ||||
| 5)  GC_expand_hp(bytes) | ||||
|     - Explicitly increase the heap size.  (This is normally done automatically | ||||
|       if a garbage collection failed to GC_reclaim enough memory.  Explicit | ||||
|       calls to GC_expand_hp may prevent unnecessarily frequent collections at | ||||
|       program startup.) | ||||
|  | ||||
| 6)  GC_malloc_ignore_off_page(bytes) | ||||
| 	- identical to GC_malloc, but the client promises to keep a pointer to | ||||
| 	  the somewhere within the first 256 bytes of the object while it is | ||||
| 	  live.  (This pointer should nortmally be declared volatile to prevent | ||||
| 	  interference from compiler optimizations.)  This is the recommended | ||||
| 	  way to allocate anything that is likely to be larger than 100Kbytes | ||||
| 	  or so.  (GC_malloc may result in failure to reclaim such objects.) | ||||
|  | ||||
| 7)  GC_set_warn_proc(proc) | ||||
| 	- Can be used to redirect warnings from the collector.  Such warnings | ||||
| 	  should be rare, and should not be ignored during code development. | ||||
|        | ||||
| 8) GC_enable_incremental() | ||||
|     - Enables generational and incremental collection.  Useful for large | ||||
|       heaps on machines that provide access to page dirty information. | ||||
|       Some dirty bit implementations may interfere with debugging | ||||
|       (by catching address faults) and place restrictions on heap arguments | ||||
|       to system calls (since write faults inside a system call may not be | ||||
|       handled well). | ||||
|  | ||||
| 9) Several routines to allow for registration of finalization code. | ||||
|    User supplied finalization code may be invoked when an object becomes | ||||
|    unreachable.  To call (*f)(obj, x) when obj becomes inaccessible, use | ||||
| 	GC_register_finalizer(obj, f, x, 0, 0); | ||||
|    For more sophisticated uses, and for finalization ordering issues, | ||||
|    see gc.h. | ||||
|  | ||||
|   The global variable GC_free_space_divisor may be adjusted up from its | ||||
| default value of 4 to use less space and more collection time, or down for | ||||
| the opposite effect.  Setting it to 1 or 0 will effectively disable collections | ||||
| and cause all allocations to simply grow the heap. | ||||
|  | ||||
|   The variable GC_non_gc_bytes, which is normally 0, may be changed to reflect | ||||
| the amount of memory allocated by the above routines that should not be | ||||
| considered as a candidate for collection.  Careless use may, of course, result | ||||
| in excessive memory consumption. | ||||
|  | ||||
|   Some additional tuning is possible through the parameters defined | ||||
| near the top of gc_priv.h. | ||||
|    | ||||
|   If only GC_malloc is intended to be used, it might be appropriate to define: | ||||
|  | ||||
| #define malloc(n) GC_malloc(n) | ||||
| #define calloc(m,n) GC_malloc((m)*(n)) | ||||
|  | ||||
|   For small pieces of VERY allocation intensive code, gc_inl.h | ||||
| includes some allocation macros that may be used in place of GC_malloc | ||||
| and friends. | ||||
|  | ||||
|   All externally visible names in the garbage collector start with "GC_". | ||||
| To avoid name conflicts, client code should avoid this prefix, except when | ||||
| accessing garbage collector routines or variables. | ||||
|  | ||||
|   There are provisions for allocation with explicit type information. | ||||
| This is rarely necessary.  Details can be found in gc_typed.h. | ||||
|  | ||||
| THE C++ INTERFACE TO THE ALLOCATOR: | ||||
|  | ||||
|   The Ellis-Hull C++ interface to the collector is included in | ||||
| the collector distribution.  If you intend to use this, type | ||||
| "make c++" after the initial build of the collector is complete. | ||||
| See gc_cpp.h for the definition of the interface.  This interface | ||||
| tries to approximate the Ellis-Detlefs C++ garbage collection | ||||
| proposal without compiler changes. | ||||
|  | ||||
| Cautions: | ||||
| 1. Arrays allocated without new placement syntax are | ||||
| allocated as uncollectable objects.  They are traced by the | ||||
| collector, but will not be reclaimed. | ||||
|  | ||||
| 2. Failure to use "make c++" in combination with (1) will | ||||
| result in arrays allocated using the default new operator. | ||||
| This is likely to result in disaster without linker warnings. | ||||
|  | ||||
| 3. If your compiler supports an overloaded new[] operator, | ||||
| then gc_cpp.cc and gc_cpp.h should be suitably modified. | ||||
|  | ||||
| 4. Many current C++ compilers have deficiencies that | ||||
| break some of the functionality.  See the comments in gc_cpp.h | ||||
| for suggested workarounds. | ||||
|  | ||||
| USE AS LEAK DETECTOR: | ||||
|  | ||||
|   The collector may be used to track down leaks in C programs that are | ||||
| intended to run with malloc/free (e.g. code with extreme real-time or | ||||
| portability constraints).  To do so define FIND_LEAK in Makefile | ||||
| This will cause the collector to invoke the report_leak | ||||
| routine defined near the top of reclaim.c whenever an inaccessible | ||||
| object is found that has not been explicitly freed.  Such objects will | ||||
| also be automatically reclaimed. | ||||
|   Productive use of this facility normally involves redefining report_leak | ||||
| to do something more intelligent.  This typically requires annotating | ||||
| objects with additional information (e.g. creation time stack trace) that | ||||
| identifies their origin.  Such code is typically not very portable, and is | ||||
| not included here, except on SPARC machines. | ||||
|   If all objects are allocated with GC_DEBUG_MALLOC (see next section), | ||||
| then the default version of report_leak will report the source file | ||||
| and line number at which the leaked object was allocated.  This may | ||||
| sometimes be sufficient.  (On SPARC/SUNOS4 machines, it will also report | ||||
| a cryptic stack trace.  This can often be turned into a sympolic stack | ||||
| trace by invoking program "foo" with "callprocs foo".  Callprocs is | ||||
| a short shell script that invokes adb to expand program counter values | ||||
| to symbolic addresses.  It was largely supplied by Scott Schwartz.) | ||||
|   Note that the debugging facilities described in the next section can | ||||
| sometimes be slightly LESS effective in leak finding mode, since in | ||||
| leak finding mode, GC_debug_free actually results in reuse of the object. | ||||
| (Otherwise the object is simply marked invalid.)  Also note that the test | ||||
| program is not designed to run meaningfully in FIND_LEAK mode. | ||||
| Use "make gc.a" to build the collector. | ||||
|  | ||||
| DEBUGGING FACILITIES: | ||||
|  | ||||
|   The routines GC_debug_malloc, GC_debug_malloc_atomic, GC_debug_realloc, | ||||
| and GC_debug_free provide an alternate interface to the collector, which | ||||
| provides some help with memory overwrite errors, and the like. | ||||
| Objects allocated in this way are annotated with additional | ||||
| information.  Some of this information is checked during garbage | ||||
| collections, and detected inconsistencies are reported to stderr. | ||||
|  | ||||
|   Simple cases of writing past the end of an allocated object should | ||||
| be caught if the object is explicitly deallocated, or if the | ||||
| collector is invoked while the object is live.  The first deallocation | ||||
| of an object will clear the debugging info associated with an | ||||
| object, so accidentally repeated calls to GC_debug_free will report the | ||||
| deallocation of an object without debugging information.  Out of | ||||
| memory errors will be reported to stderr, in addition to returning | ||||
| NIL. | ||||
|  | ||||
|   GC_debug_malloc checking  during garbage collection is enabled | ||||
| with the first call to GC_debug_malloc.  This will result in some | ||||
| slowdown during collections.  If frequent heap checks are desired, | ||||
| this can be achieved by explicitly invoking GC_gcollect, e.g. from | ||||
| the debugger. | ||||
|  | ||||
|   GC_debug_malloc allocated objects should not be passed to GC_realloc | ||||
| or GC_free, and conversely.  It is however acceptable to allocate only | ||||
| some objects with GC_debug_malloc, and to use GC_malloc for other objects, | ||||
| provided the two pools are kept distinct.  In this case, there is a very | ||||
| low probablility that GC_malloc allocated objects may be misidentified as | ||||
| having been overwritten.  This should happen with probability at most | ||||
| one in 2**32.  This probability is zero if GC_debug_malloc is never called. | ||||
|  | ||||
|   GC_debug_malloc, GC_malloc_atomic, and GC_debug_realloc take two | ||||
| additional trailing arguments, a string and an integer.  These are not | ||||
| interpreted by the allocator.  They are stored in the object (the string is | ||||
| not copied).  If an error involving the object is detected, they are printed. | ||||
|  | ||||
|   The macros GC_MALLOC, GC_MALLOC_ATOMIC, GC_REALLOC, GC_FREE, and | ||||
| GC_REGISTER_FINALIZER are also provided.  These require the same arguments | ||||
| as the corresponding (nondebugging) routines.  If gc.h is included | ||||
| with GC_DEBUG defined, they call the debugging versions of these | ||||
| functions, passing the current file name and line number as the two | ||||
| extra arguments, where appropriate.  If gc.h is included without GC_DEBUG | ||||
| defined, then all these macros will instead be defined to their nondebugging | ||||
| equivalents.  (GC_REGISTER_FINALIZER is necessary, since pointers to | ||||
| objects with debugging information are really pointers to a displacement | ||||
| of 16 bytes form the object beginning, and some translation is necessary | ||||
| when finalization routines are invoked.  For details, about what's stored | ||||
| in the header, see the definition of the type oh in debug_malloc.c) | ||||
|  | ||||
| INCREMENTAL/GENERATIONAL COLLECTION: | ||||
|  | ||||
| The collector normally interrupts client code for the duration of  | ||||
| a garbage collection mark phase.  This may be unacceptable if interactive | ||||
| response is needed for programs with large heaps.  The collector | ||||
| can also run in a "generational" mode, in which it usually attempts to | ||||
| collect only objects allocated since the last garbage collection. | ||||
| Furthermore, in this mode, garbage collections run mostly incrementally, | ||||
| with a small amount of work performed in response to each of a large number of | ||||
| GC_malloc requests. | ||||
|  | ||||
| This mode is enabled by a call to GC_enable_incremental(). | ||||
|  | ||||
| Incremental and generational collection is effective in reducing | ||||
| pause times only if the collector has some way to tell which objects | ||||
| or pages have been recently modified.  The collector uses two sources | ||||
| of information: | ||||
|  | ||||
| 1. Information provided by the VM system.  This may be provided in | ||||
| one of several forms.  Under Solaris 2.X (and potentially under other | ||||
| similar systems) information on dirty pages can be read from the | ||||
| /proc file system.  Under other systems (currently SunOS4.X) it is | ||||
| possible to write-protect the heap, and catch the resulting faults. | ||||
| On these systems we require that system calls writing to the heap | ||||
| (other than read) be handled specially by client code. | ||||
| See os_dep.c for details. | ||||
|  | ||||
| 2. Information supplied by the programmer.  We define "stubborn" | ||||
| objects to be objects that are rarely changed.  Such an object | ||||
| can be allocated (and enabled for writing) with GC_malloc_stubborn. | ||||
| Once it has been initialized, the collector should be informed with | ||||
| a call to GC_end_stubborn_change.  Subsequent writes that store | ||||
| pointers into the object must be preceded by a call to | ||||
| GC_change_stubborn. | ||||
|  | ||||
| This mechanism performs best for objects that are written only for | ||||
| initialization, and such that only one stubborn object is writable | ||||
| at once.  It is typically not worth using for short-lived | ||||
| objects.  Stubborn objects are treated less efficiently than pointerfree | ||||
| (atomic) objects. | ||||
|  | ||||
| A rough rule of thumb is that, in the absence of VM information, garbage | ||||
| collection pauses are proportional to the amount of pointerful storage | ||||
| plus the amount of modified "stubborn" storage that is reachable during | ||||
| the collection.   | ||||
|  | ||||
| Initial allocation of stubborn objects takes longer than allocation | ||||
| of other objects, since other data structures need to be maintained. | ||||
|  | ||||
| We recommend against random use of stubborn objects in client | ||||
| code, since bugs caused by inappropriate writes to stubborn objects | ||||
| are likely to be very infrequently observed and hard to trace.   | ||||
| However, their use may be appropriate in a few carefully written | ||||
| library routines that do not make the objects themselves available | ||||
| for writing by client code. | ||||
|  | ||||
|  | ||||
| BUGS: | ||||
|  | ||||
|   Any memory that does not have a recognizable pointer to it will be | ||||
| reclaimed.  Exclusive-or'ing forward and backward links in a list | ||||
| doesn't cut it. | ||||
|   Some C optimizers may lose the last undisguised pointer to a memory | ||||
| object as a consequence of clever optimizations.  This has almost | ||||
| never been observed in practice.  Send mail to boehm@acm.org | ||||
| for suggestions on how to fix your compiler. | ||||
|   This is not a real-time collector.  In the standard configuration, | ||||
| percentage of time required for collection should be constant across | ||||
| heap sizes.  But collection pauses will increase for larger heaps. | ||||
| (On SPARCstation 2s collection times will be on the order of 300 msecs | ||||
| per MB of accessible memory that needs to be scanned.  Your mileage | ||||
| may vary.)  The incremental/generational collection facility helps, | ||||
| but is portable only if "stubborn" allocation is used. | ||||
|   Please address bug reports to boehm@acm.org.  If you are | ||||
| contemplating a major addition, you might also send mail to ask whether | ||||
| it's already been done (or whether we tried and discarded it). | ||||
|  | ||||
		Reference in New Issue
	
	Block a user