Name                  Date         Size
..                    16-Mar-2016  12 KiB
bootsect.S            23-Jan-2011  8 KiB
changelog             23-Jan-2011  292 B
config.c              23-Jan-2011  13.4 KiB
config.h              23-Jan-2011  1.4 KiB
controller.c          23-Jan-2011  97.9 KiB
controller.h          23-Jan-2011  177 B
cpuid.c               23-Jan-2011  6.8 KiB
cpuid.h               23-Jan-2011  6.1 KiB
defs.h                23-Jan-2011  989 B
dmi.c                 23-Jan-2011  7.4 KiB
dmi.h                 23-Jan-2011  124 B
elf.h                 23-Jan-2011  23.4 KiB
error.c               23-Jan-2011  12.7 KiB
extra.c               23-Jan-2011  25.4 KiB
extra.h               23-Jan-2011  865 B
FAQ                   23-Jan-2011  15.6 KiB
head.S                23-Jan-2011  22.3 KiB
init.c                23-Jan-2011  34.5 KiB
io.h                  23-Jan-2011  3.4 KiB
jedec_id.h            23-Jan-2011  28.2 KiB
lib.c                 23-Jan-2011  23.5 KiB
linuxbios.c           23-Jan-2011  3.8 KiB
linuxbios_tables.h    23-Jan-2011  2.6 KiB
main.c                23-Jan-2011  15.6 KiB
makedos.sh            23-Jan-2011  282 B
Makefile              23-Jan-2011  1.8 KiB
makeiso.sh            23-Jan-2011  1.2 KiB
memsize.c             23-Jan-2011  12.6 KiB
memtest.bin.lds       23-Jan-2011  246 B
memtest.lds           23-Jan-2011  141 B
memtest_shared.lds    23-Jan-2011  975 B
msr.h                 23-Jan-2011  3.3 KiB
mt86+_loader          23-Jan-2011  784 B
mt86+_loader.asm      23-Jan-2011  6.3 KiB
patn.c                23-Jan-2011  3.7 KiB
pci.c                 23-Jan-2011  4.3 KiB
pci.h                 23-Jan-2011  4.1 KiB
precomp.bin           23-Jan-2011  160.6 KiB
random.c              23-Jan-2011  1,004 B
README                23-Jan-2011  21.7 KiB
README.build-process  23-Jan-2011  1.9 KiB
reloc.c               23-Jan-2011  7.4 KiB
screen_buffer.c       23-Jan-2011  2.4 KiB
screen_buffer.h       23-Jan-2011  768 B
serial.h              23-Jan-2011  6.1 KiB
setup.S               23-Jan-2011  3.2 KiB
smp.c                 23-Jan-2011  10.1 KiB
smp.h                 23-Jan-2011  6.4 KiB
spd.c                 23-Jan-2011  12.1 KiB
spd.h                 23-Jan-2011  410 B
stdint.h              23-Jan-2011  1.4 KiB
test.c                23-Jan-2011  28.1 KiB
test.h                23-Jan-2011  8.3 KiB

README

Memtest86+ v4.20
================

Table of Contents
=================
  1) Introduction
  2) Licensing
  3) Installation
  4) Serial Console
  5) Online Commands
  6) Memory Sizing
  7) Error Information
  8) Trouble-shooting Memory Errors
  9) Execution Time
 10) Memory Testing Philosophy
 11) Memtest86+ Test Algorithms
 12) Individual Test Descriptions
 13) Problem Reporting - Contact Information
 14) Known Problems



1) Introduction
===============
Memtest86+ is a thorough, stand-alone memory test for Intel/AMD x86 architecture
systems.  BIOS based memory tests are only a quick check and often miss
failures that are detected by Memtest86+.

For updates go to the Memtest86+ web page:

	http://www.memtest.org


2) Licensing
============
Memtest86+ is released under the terms of the GNU General Public License (GPL).
Other than the provisions of the GPL there are no restrictions for use, private
or commercial.  See: http://www.gnu.org/licenses/gpl.html for details.
Explicit permission for inclusion of Memtest86+ in software compilations and
publications is hereby granted.


3) Installation (Linux Only)
============================
Memtest86+ is a stand-alone program and can be loaded from either a disk
partition or from a floppy disk.

To build Memtest86+:
   1) Review the Makefile and adjust options as needed.
   2) Type "make"

This creates a file named "memtest.bin" which is a bootable image.  This
image file may be copied to a floppy disk or lilo may be used to boot this
image from a hard disk partition.

To create a Memtest86+ boot disk:
   1) Insert a blank write-enabled floppy disk.
   2) As root, type "make install"

To boot from a disk partition via lilo:
   1) Copy the image file to a permanent location (e.g. /memtest).
   2) Add an entry in the lilo config file (usually /etc/lilo.conf) to boot
      Memtest86+.  Only the image and label fields need to be specified.
      The following is a sample lilo entry for booting Memtest86+:

	image = /memtest
	label = memtest

   3) As root, type "lilo"

      At the lilo prompt enter memtest to boot Memtest86+.

If you encounter build problems, a binary image has been included (precomp.bin).
To create a boot disk with this pre-built image do the following:
   1) Insert a blank write-enabled floppy disk.
   2) Type "make install-precomp"


4) Serial Console
=================
Memtest86+ can be used on PCs equipped with a serial port for the console.
By default serial port console support is not enabled since it slows
down testing.  To enable it, change the SERIAL_CONSOLE_DEFAULT define in
config.h from a zero to a one.  The serial console baud rate may also
be set in config.h with the SERIAL_BAUD_RATE define.  The other serial
port settings are no parity, 8 data bits, 1 stop bit.  All of the features
used by Memtest86+ are accessible via the serial console.  However, the
screen is sometimes garbled when the online commands are used.

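For illustration, the two defines named above might read as follows in
config.h once the serial console is enabled.  The baud rate shown is an
assumed example value, not one taken from this README, and the surrounding
contents of config.h are not reproduced here.

```c
/* config.h (excerpt, illustrative only) */
#define SERIAL_CONSOLE_DEFAULT 1     /* 0 = serial console off, 1 = on */
#define SERIAL_BAUD_RATE       9600  /* assumed value; match your terminal */
```

The terminal on the other end would then be set to the same baud rate with
no parity, 8 data bits and 1 stop bit.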

5) Online Commands
==================
Memtest86+ has a limited number of online commands.  Online commands
provide control over caching, test selection, address range and error
scrolling.  A help bar is displayed at the bottom of the screen listing
the available online commands.

  Command  Description

  ESC   Exits the test and does a warm restart via the BIOS.

  c     Enters test configuration menu
	    Menu options are:
	       1) Cache mode
	       2) Test selection
	       3) Address Range
	       4) Memory Sizing
	       5) Error Summary
	       6) Error Report Mode
	       7) ECC Mode
	       8) Restart
	       9) Adv. Options

  SP    Set scroll lock (Stops scrolling of error messages)
	Note: Testing is stalled when the scroll lock is
	set and the scroll region is full.

  CR    Clear scroll lock (Enables error message scrolling)


6) Memory Sizing
================
The BIOS in modern PCs will often reserve several sections of memory for
its own use and also to communicate information to the operating system (e.g.
ACPI tables).  It is just as important to test these reserved memory blocks
as it is for the remainder of memory.  For proper operation all of memory
needs to function properly regardless of what the eventual use is.  For
this reason Memtest86+ has been designed to test as much memory as is
possible.

However, safely and reliably detecting all of the available memory has been
problematic.  Versions of Memtest86+ prior to v0.91 would probe to find where
memory is.  This works for the vast majority of motherboards but is not 100%
reliable.  Sometimes the memory size is incorrect and, worse, probing the wrong
places can in some cases cause the test to hang or crash.

Starting in version 0.91 alternative methods are available for determining the
memory size.  By default the test attempts to get the memory size from the
BIOS using the "e820" method.  With "e820" the BIOS provides a table of memory
segments and identifies what they will be used for.  By default Memtest86+
will test all of the RAM marked as available and also the area reserved for
the ACPI tables.  This is safe since the test does not use the ACPI tables
and the "e820" specifications state that this memory may be reused after the
tables have been copied.  Although this is a safe default some memory will
not be tested.

Two additional options are available through online configuration options.
The first option (BIOS-All) also uses the "e820" method to obtain a memory
map.  However, when this option is selected all of the reserved memory
segments are tested, regardless of what their intended use is.  The only
exception is memory segments that begin above 3 GB.  Testing has shown that
these segments are typically not safe to test.  The BIOS-All option is more
thorough but could be unstable with some motherboards.

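The selection rules for the default and BIOS-All modes can be sketched in C
as below.  This is only an illustration under stated assumptions: the struct
layout and the names segment_is_tested and testable_bytes are hypothetical
and are not taken from memsize.c, and every segment type other than
"available" and "ACPI" is treated as reserved.  The numeric type codes are
the standard e820 address range types.

```c
#include <stdint.h>
#include <stddef.h>

/* Standard e820 address range types. */
enum { E820_RAM = 1, E820_RESERVED = 2, E820_ACPI = 3, E820_NVS = 4 };

/* Hypothetical in-memory form of one e820 table entry. */
struct e820_entry {
    uint64_t addr;   /* start of the segment */
    uint64_t size;   /* length in bytes */
    uint32_t type;   /* one of the E820_* codes */
};

#define LIMIT_3GB (3ULL << 30)

/* Return 1 if the segment would be tested.
 * bios_all == 0: default mode (available RAM plus the ACPI table area).
 * bios_all != 0: BIOS-All mode (reserved segments too, unless they
 *                begin above 3 GB). */
int segment_is_tested(const struct e820_entry *e, int bios_all)
{
    if (e->type == E820_RAM || e->type == E820_ACPI)
        return 1;
    if (bios_all && e->addr <= LIMIT_3GB)
        return 1;
    return 0;
}

/* Sum the sizes of all segments that would be tested. */
uint64_t testable_bytes(const struct e820_entry *map, size_t n, int bios_all)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        if (segment_is_tested(&map[i], bios_all))
            total += map[i].size;
    return total;
}
```

The difference between the two totals for a given map corresponds to the
reserved memory that only the BIOS-All option would cover.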

The second option for memory sizing is the traditional "Probe" method.
This is a very thorough but not entirely safe method.  In the majority of
cases the BIOS-All and Probe methods will return the same memory map.

For older BIOSes that do not support the "e820" method there are two
additional methods (e801 and e88) for getting the memory size from the
BIOS.  These methods only provide the amount of extended memory that is
available, not a memory table.  When the e801 and e88 methods are used
the BIOS-All option will not be available.

The MemMap field on the display shows what memory size method is in use.
Also the RsvdMem field shows how much memory is reserved and is not being
tested.


7) Error Information
====================
Memtest86+ has two options for reporting errors.  The default is to report
individual errors.  In BadRAM Patterns mode patterns are created for
use with the Linux BadRAM feature.  This slick feature allows Linux to
avoid bad memory pages.  Details about the BadRAM feature can be found at:

	http://home.zonnet.nl/vanrein/badram

For individual errors the following information is displayed when a memory
error is detected.  An error message is only displayed for errors with a
different address or failing bit pattern.  All displayed values are in
hexadecimal.

  Tst:			Test number
  Failing Address :	Failing memory address
  Good:			Expected data pattern
  Bad:			Failing data pattern
  Err-Bits:		Exclusive or of good and bad data (this shows the
			position of the failing bit(s))
  Count:		Number of consecutive errors with the same address
			and failing bits

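As a small worked example of the Err-Bits field, the exclusive or of the
expected and the observed word leaves a 1 in every failing bit position.
The function names below are hypothetical and chosen only for this sketch.

```c
#include <stdint.h>

/* Err-Bits: a 1 in each position where good and bad data differ. */
uint32_t err_bits(uint32_t good, uint32_t bad)
{
    return good ^ bad;
}

/* Number of failing bit positions in an Err-Bits value. */
int failing_bit_count(uint32_t bits)
{
    int n = 0;
    while (bits) {
        bits &= bits - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

For example, Good ffffffff and Bad fffffbff give Err-Bits 00000400, that is,
a single failing bit at position 10.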

In BadRAM Patterns mode, lines are printed in the form badram=F1,M1,F2,M2.
In each F/M pair, the F represents a fault address, and the corresponding M
is a bitmask for that address. These patterns state that faults have
occurred in addresses that equal F on all "1" bits in M. Such a pattern may
capture more errors than actually exist, but at least all the errors are
captured. These patterns have been designed to capture regular patterns of
errors caused by the hardware structure in a terse syntax.

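The matching rule for one F/M pair can be written as a one-line check.  The
function name is hypothetical; the rule itself is the one stated above (the
address must equal F on every bit that is set in M).

```c
#include <stdint.h>

/* Return 1 if addr is covered by the fault/mask pair (f, m): addr must
 * agree with f on every bit position where m has a 1. */
int badram_covers(uint32_t addr, uint32_t f, uint32_t m)
{
    return ((addr ^ f) & m) == 0;
}
```

With F = 12345000 and M = fffff000, for instance, every address from
12345000 through 12345fff is covered, which is how a pattern can capture
more addresses than actually failed.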

The BadRAM patterns are `grown' incrementally rather than `designed' from an
overview of all errors. The number of pairs is constrained to five for a
number of practical reasons. As a result, handcrafting patterns from the
output in address printing mode may, in exceptional cases, yield better
results.


8) Trouble-shooting Memory Errors
=================================
Please be aware that not all errors reported by Memtest86+ are due to
bad memory. The test implicitly tests the CPU, L1 and L2 caches as well as
the motherboard.  It is impossible for the test to determine what causes
the failure to occur.  Most failures will be due to a problem with memory.
When it is not, the only option is to replace parts until the failure is
corrected.

Once a memory error has been detected, determining the failing
module is not a clear-cut procedure.  With the large number of motherboard
vendors and possible combinations of SIMM slots it would be difficult if
not impossible to assemble complete information about how a particular
error would map to a failing memory module.  However, there are steps
that may be taken to determine the failing module.  Here are four
techniques that you may wish to use:

1) Removing modules
This is the simplest method for isolating a failing module, but may only be
employed when one or more modules can be removed from the system.  By
selectively removing modules from the system and then running the test
you will be able to find the bad module(s).  Be sure to note exactly which
modules are in the system when the test passes and when the test fails.

2) Rotating modules
When none of the modules can be removed then you may wish to rotate modules
to find the failing one.  This technique can only be used if there are
three or more modules in the system.  Change the location of two modules
at a time.  For example put the module from slot 1 into slot 2 and put
the module from slot 2 in slot 1.  Run the test and if either the failing
bit or address changes then you know that the failing module is one of the
ones just moved. By using several combinations of module movement you
should be able to determine which module is failing.

3) Replacing modules
If you are unable to use either of the previous techniques then you are
left to selective replacement of modules to find the failure.

4) Avoiding allocation
The printing mode for BadRAM patterns is intended to construct boot time
parameters for a Linux kernel that is compiled with BadRAM support. This
work-around makes it possible for Linux to reliably run on defective
RAM.  For more information on BadRAM support for Linux, see

       http://home.zonnet.nl/vanrein/badram

Sometimes memory errors show up due to component incompatibility.  A memory
module may work fine in one system and not in another.  This is not
uncommon and is a source of confusion.  The components are not necessarily
bad but certain combinations may need to be avoided.

I am often asked about the reliability of errors reported by Memtest86+.
In the vast majority of cases errors reported by the test are valid.
There are some systems that cause Memtest86+ to be confused about the size of
memory and it will try to test non-existent memory.  This will cause a large
number of consecutive addresses to be reported as bad and generally there
will be many bits in error.  If you have a relatively small number of
failing addresses and only one or two bits in error you can be certain
that the errors are valid.  Also intermittent errors are always valid.

All valid memory errors should be corrected.  It is possible that a
particular error will never show up in normal operation. However, operating
with marginal memory is risky and can result in data loss and even
disk corruption.  You can be sure that Murphy will get you if you know
about a memory error and ignore it.

Memtest86+ cannot diagnose many types of PC failures.  For example a
faulty CPU that causes Windows to crash will most likely just cause
Memtest86+ to crash in the same way.


9) Execution Time
=================
The time required for a complete pass of Memtest86+ will vary greatly
depending on CPU speed, memory speed and memory size. Memtest86+ executes
indefinitely.  The pass counter increments each time that all of the
selected tests have been run.  Generally a single pass is sufficient to
catch all but the most obscure errors. However, for complete confidence
when intermittent errors are suspected, testing for a longer period is advised.

10) Memory Testing Philosophy
=============================
There are many good approaches for testing memory.  However, many tests
simply throw some patterns at memory without much thought or knowledge
of memory architecture or how errors can best be detected. This
works fine for hard memory failures but does little to find intermittent
errors. BIOS based memory tests are useless for finding intermittent
memory errors.

Memory chips consist of a large array of tightly packed memory cells,
one for each bit of data.  The vast majority of the intermittent failures
are a result of interaction between these memory cells.  Often writing a
memory cell can cause one of the adjacent cells to be written with the
same data. An effective memory test attempts to test for this
condition. Therefore, an ideal strategy for testing memory would be
the following:

  1) write a cell with a zero
  2) write all of the adjacent cells with a one, one or more times
  3) check that the first cell still has a zero

It should be obvious that this strategy requires an exact knowledge
of how the memory cells are laid out on the chip.  In addition there is a
never ending number of possible chip layouts for different chip types
and manufacturers making this strategy impractical.  However, there
are testing algorithms that can approximate this ideal strategy.


11) Memtest86+ Test Algorithms
==============================
Memtest86+ uses two algorithms that provide a reasonable approximation
of the ideal test strategy above.  The first of these strategies is called
moving inversions.  The moving inversions test works as follows:

  1) Fill memory with a pattern
  2) Starting at the lowest address
	2a check that the pattern has not changed
	2b write the pattern's complement
	2c increment the address
	repeat 2a - 2c
  3) Starting at the highest address
	3a check that the pattern has not changed
	3b write the pattern's complement
	3c decrement the address
	repeat 3a - 3c

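The three steps above can be sketched in C over an ordinary buffer.  This is
a simplified illustration, not the code from test.c: the function name and
signature are hypothetical, the buffer stands in for the memory under test,
and only a mismatch count is returned instead of a full error report.

```c
#include <stdint.h>
#include <stddef.h>

/* One moving-inversions round over buf[0..n-1] with the given pattern.
 * Returns the number of mismatches.  volatile keeps the compiler from
 * optimizing the memory accesses away. */
size_t moving_inversions(volatile uint32_t *buf, size_t n, uint32_t pattern)
{
    size_t errors = 0;
    uint32_t inverse = ~pattern;

    /* Step 1: fill memory with the pattern. */
    for (size_t i = 0; i < n; i++)
        buf[i] = pattern;

    /* Step 2: from the lowest address up, check the pattern and then
     * write its complement. */
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != pattern)
            errors++;
        buf[i] = inverse;
    }

    /* Step 3: from the highest address down, check the complement and
     * then write its complement, which restores the original pattern. */
    for (size_t i = n; i-- > 0; ) {
        if (buf[i] != inverse)
            errors++;
        buf[i] = pattern;
    }
    return errors;
}
```

On healthy memory the function returns 0 and leaves the buffer filled with
the original pattern.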

This algorithm is a good approximation of an ideal memory test but
there are some limitations.  Most high density chips today store data
4 to 16 bits wide.  With chips that are more than one bit wide it
is impossible to selectively read or write just one bit.  This means
that we cannot guarantee that all adjacent cells have been tested
for interaction.  In this case the best we can do is to use some
patterns to ensure that all adjacent cells have at least been written
with all possible one and zero combinations.

It can also be seen that caching, buffering and out of order execution
will interfere with the moving inversions algorithm and make it less effective.
It is possible to turn off cache but the memory buffering in new high
performance chips cannot be disabled.  To address this limitation a new
algorithm I call Modulo-X was created.  This algorithm is not affected by
cache or buffering.  The algorithm works as follows:

  1) For starting offsets of 0 - 19 do
	1a write every 20th location with a pattern
	1b write all other locations with the pattern's complement
	   repeat 1b one or more times
	1c check every 20th location for the pattern

This algorithm accomplishes nearly the same level of adjacency testing
as moving inversions but is not affected by caching or buffering.  Since
separate write passes (1a, 1b) and the read pass (1c) are done for all of
memory we can be assured that all of the buffers and cache have been
flushed between passes.  The selection of 20 as the stride size was somewhat
arbitrary.  Larger strides may be more effective but would take longer to
execute.  The choice of 20 seemed to be a reasonable compromise between
speed and thoroughness.

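A minimal C sketch of the Modulo-X steps follows.  The function name, the
repeats parameter and the mismatch-count return value are assumptions made
for this illustration and are not taken from test.c.  On a small buffer the
cache is of course not flushed between passes; the sketch only shows the
order of the accesses.

```c
#include <stdint.h>
#include <stddef.h>

#define STRIDE 20   /* stride size named in the text above */

/* Modulo-X over buf[0..n-1].  Returns the number of mismatches. */
size_t modulo_x(volatile uint32_t *buf, size_t n, uint32_t pattern,
                int repeats)
{
    size_t errors = 0;
    uint32_t inverse = ~pattern;

    for (size_t offset = 0; offset < STRIDE; offset++) {
        /* 1a: write the pattern to every 20th location. */
        for (size_t i = offset; i < n; i += STRIDE)
            buf[i] = pattern;

        /* 1b: write the complement to all other locations,
         * one or more times. */
        for (int r = 0; r < repeats; r++)
            for (size_t i = 0; i < n; i++)
                if (i % STRIDE != offset)
                    buf[i] = inverse;

        /* 1c: check every 20th location for the pattern. */
        for (size_t i = offset; i < n; i += STRIDE)
            if (buf[i] != pattern)
                errors++;
    }
    return errors;
}
```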

12) Individual Test Descriptions
================================
Memtest86+ executes a series of numbered test sections to check for
errors.  These test sections consist of a combination of test
algorithm, data pattern and caching. The execution order for these tests
was arranged so that errors will be detected as rapidly as possible.
A description of each of the test sections follows:

Test 0 [Address test, walking ones, no cache]
  Tests all address bits in all memory banks by using a walking ones
  address pattern.  Errors from this test are not used to calculate
  BadRAM patterns.

Test 1 [Address test, own address]
  Each address is written with its own address and then is checked
  for consistency.  In theory previous tests should have caught any
  memory addressing problems.  This test should catch any addressing
  errors that somehow were not previously detected.

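The own-address idea from Test 1 can be sketched as follows.  The function
name is hypothetical and a plain array stands in for the memory under test.

```c
#include <stdint.h>
#include <stddef.h>

/* Write each location with its own address, then verify in a second
 * pass.  Returns the number of mismatches. */
size_t own_address_test(volatile uintptr_t *buf, size_t n)
{
    size_t errors = 0;

    for (size_t i = 0; i < n; i++)
        buf[i] = (uintptr_t)&buf[i];

    for (size_t i = 0; i < n; i++)
        if (buf[i] != (uintptr_t)&buf[i])
            errors++;

    return errors;
}
```

If two addresses were aliased by a faulty address line, the later write
would overwrite the earlier one and the verify pass would report it.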

Test 2 [Moving inversions, ones&zeros]
  This test uses the moving inversions algorithm with patterns of all
  ones and zeros.  Cache is enabled even though it interferes to some
  degree with the test algorithm.  With cache enabled this test does not
  take long and should quickly find all "hard" errors and some more
  subtle errors.  This section is only a quick check.

Test 3 [Moving inversions, 8 bit pat]
  This is the same as test 2 but uses an 8 bit wide pattern of
  "walking" ones and zeros.  This test will better detect subtle errors
  in "wide" memory chips.  A total of 20 data patterns are used.

Test 4 [Moving inversions, random pattern]
  Test 4 uses the same algorithm as test 2 but the data pattern is a
  random number and its complement. This test is particularly effective
  in finding difficult to detect data sensitive errors. A total of 60
  patterns are used. The random number sequence is different with each pass
  so multiple passes increase effectiveness.

Test 5 [Block move, 64 moves]
  This test stresses memory by using block move (movsl) instructions
  and is based on Robert Redelmeier's burnBX test.  Memory is initialized
  with shifting patterns that are inverted every 8 bytes.  Then 4MB blocks
  of memory are moved around using the movsl instruction.  After the moves
  are completed the data patterns are checked.  Because the data is checked
  only after the memory moves are completed it is not possible to know
  where the error occurred.  The addresses reported are only for where the
  bad pattern was found.  Since the moves are constrained to an 8MB segment
  of memory the failing address will always be less than 8MB away from the
  reported address.  Errors from this test are not used to calculate
  BadRAM patterns.

Test 6 [Moving inversions, 32 bit pat]
  This is a variation of the moving inversions algorithm that shifts the data
  pattern left one bit for each successive address. The starting bit position
  is shifted left for each pass. To use all possible data patterns 32 passes
  are required.  This test is quite effective at detecting data sensitive
  errors but the execution time is long.

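The per-address pattern used by the 32 bit variation can be sketched as
below.  Two assumptions are made that the text above does not state: the
starting pattern is a single 1 bit, and the shift wraps around (a rotate)
once the bit leaves the top of the word.  All names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Rotate a 32 bit value left by k bits (k is reduced modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned k)
{
    k &= 31;
    return k ? (v << k) | (v >> (32 - k)) : v;
}

/* Pattern for word i during the given pass: one set bit whose position
 * advances by one per address and by one per pass (assumed seed: 1). */
uint32_t pattern_at(size_t i, unsigned pass)
{
    return rotl32(1u, (unsigned)((i + pass) % 32));
}

/* One moving-inversions round using the per-address pattern.
 * Returns the number of mismatches. */
size_t moving_inv_32(volatile uint32_t *buf, size_t n, unsigned pass)
{
    size_t errors = 0;

    for (size_t i = 0; i < n; i++)
        buf[i] = pattern_at(i, pass);

    for (size_t i = 0; i < n; i++) {          /* ascending pass */
        if (buf[i] != pattern_at(i, pass))
            errors++;
        buf[i] = ~pattern_at(i, pass);
    }

    for (size_t i = n; i-- > 0; ) {           /* descending pass */
        if (buf[i] != (uint32_t)~pattern_at(i, pass))
            errors++;
        buf[i] = pattern_at(i, pass);
    }
    return errors;
}
```

Calling moving_inv_32 with pass values 0 through 31 places the set bit in
every position of every word, which matches the 32 passes mentioned above.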

Test 7 [Random number sequence]
  This test writes a series of random numbers into memory. By resetting the
  seed for the random number generator the same sequence of numbers can be
  created for a reference. The initial pattern is checked and then
  complemented and checked again on the next pass. However, unlike the moving
  inversions test writing and checking can only be done in the forward
  direction.

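The reseeding trick can be illustrated with any deterministic generator.
The sketch below uses a xorshift32 generator purely as a stand-in; it is an
assumption of this example and not the generator implemented in random.c.
The function names are hypothetical and the seed must be non-zero.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in pseudo-random generator (xorshift32); state must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Write a random sequence, then regenerate it from the same seed to
 * check it, complement it, and check the complement.  All passes run
 * in the forward direction.  Returns the number of mismatches. */
size_t random_sequence_test(volatile uint32_t *buf, size_t n, uint32_t seed)
{
    size_t errors = 0;
    uint32_t s = seed;

    for (size_t i = 0; i < n; i++)
        buf[i] = xorshift32(&s);

    s = seed;                                 /* reset the seed */
    for (size_t i = 0; i < n; i++) {
        uint32_t expect = xorshift32(&s);
        if (buf[i] != expect)
            errors++;
        buf[i] = ~expect;                     /* complement for next pass */
    }

    s = seed;                                 /* reset the seed again */
    for (size_t i = 0; i < n; i++) {
        uint32_t expect = ~xorshift32(&s);
        if (buf[i] != expect)
            errors++;
    }
    return errors;
}
```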

Test 8 [Modulo 20, ones&zeros]
  The Modulo-X algorithm should uncover errors that are not
  detected by moving inversions due to cache and buffering interference
  with the algorithm.  All ones and zeros are used for data patterns.

Test 9 [Bit fade test, 90 min, 2 patterns]
  The bit fade test initializes all of memory with a pattern and then
  sleeps for 90 minutes. Then memory is examined to see if any memory bits
  have changed. All ones and all zero patterns are used. This test takes
  3 hours to complete. The Bit Fade test is not included in the normal test
  sequence and must be run manually via the runtime configuration menu.


14) Known Problems
==================
Sometimes when booting from a floppy disk the following messages scroll up
on the screen:
        X:8000
        AX:0212
        BX:8600
        CX:0201
        DX:0000
This is the BIOS reporting floppy disk read errors.  Either re-write or toss
the floppy disk.

Memtest86+ has no support for multiple CPUs.  Memtest86+ should run
without problems, but it will only use one CPU.

Memtest86+ cannot diagnose many types of PC failures.  For example a
faulty CPU that causes Windows to crash will most likely just cause
Memtest86+ to crash in the same way.

There have been numerous reports of errors in only tests 5 and 8 on Athlon
systems.  Often the memory works in a different system or the vendor insists
that it is good.  In these cases the memory is not necessarily bad but is
not able to operate reliably at Athlon speeds.  Sometimes more conservative
memory timings on the motherboard will correct these errors.  In other
cases the only option is to replace the memory with better quality, higher
speed memory.  Don't buy cheap memory and expect it to work with an Athlon!

Memtest86+ supports all types of memory.  In fact the test has absolutely
no knowledge of the memory type nor does it need to.  This is not a problem
or bug but is listed here due to the many questions I get about this issue.

Changes in the compiler and loader have caused problems with
Memtest86+ resulting in both build failures and errors in execution.  A
binary image (precomp.bin) of the test is included and may be used if
problems are encountered.


README.build-process

During memory testing memtest86+ relocates itself in memory so it can test the
memory it was previously running from.  memtest86+ is compiled as mostly
position-independent code.  Some relocation records must be processed to achieve
the effect of position-independent code.  A 16 bit loader is prepended to
memtest86+ so it can be loaded from a floppy, or from lilo.

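To make the idea of processing relocation records concrete, here is a
hedged sketch of applying i386 "relative" relocations to a raw image.  It is
not the code from reloc.c: the struct and function names are hypothetical,
the image is an ordinary byte buffer, and only the R_386_RELATIVE type
(value 8 in the i386 ELF ABI) is handled.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define R_386_RELATIVE 8   /* i386 ELF relocation type */

/* Same shape as a 32 bit ELF "Rel" record. */
struct rel32 {
    uint32_t r_offset;   /* offset in the image of the word to patch */
    uint32_t r_info;     /* low 8 bits hold the relocation type */
};

/* Add delta (new load address minus previous load address) to every
 * word named by an R_386_RELATIVE record.  Records of other types or
 * with out-of-range offsets are skipped.  Returns the number applied. */
size_t apply_relative_relocs(uint8_t *image, size_t image_size,
                             const struct rel32 *rel, size_t nrel,
                             uint32_t delta)
{
    size_t applied = 0;

    for (size_t i = 0; i < nrel; i++) {
        uint32_t word;

        if ((rel[i].r_info & 0xff) != R_386_RELATIVE)
            continue;
        if (image_size < sizeof word ||
            rel[i].r_offset > image_size - sizeof word)
            continue;

        memcpy(&word, image + rel[i].r_offset, sizeof word);
        word += delta;
        memcpy(image + rel[i].r_offset, &word, sizeof word);
        applied++;
    }
    return applied;
}
```

A pointer that has no relocation record is left untouched by such a pass,
so it would still refer to the old location after the move.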

In restructuring the build process I had several goals: maintainability and
comprehensibility of the build process, simplicity of the toolset, and the
ability to build both images bootable by the legacy x86 bootloader
and images bootable by bootloaders that directly load static ELF images.

With the ability to process relocation records, memtest.bin has been
reduced in size from 84480 bytes to 49308 bytes, a reduction of about 35K,
and now only requires one copy of memtest86+.  The build process can now
ignore the size of memtest86+.

BIOS calls have been moved from setup.S to head.S, making bootsect.S and
setup.S exclusively for booting.

memtest86+ is built in three stages.  In the first stage the relocatable object
files are built as with any program.  In the second stage the relocatable object
files are linked together into memtest_shared, a shared library version
of memtest86+.  In the third stage a raw memory image of memtest_shared is formed
and linked into memtest.bin and memtest.

memtest.bin is the floppy/lilo bootable target.

memtest is the ELF bootable target.

Another major change is that data in the bss segment is now also preserved
when memtest86+ is relocated, and memtest86+ can be relocated to any address.

The one thing to watch out for is pointers to data inside of memtest86+.  Except
for constant pointers to static data there is not enough information to generate
relocation records for pointers, so they will not change when memtest86+ is
relocated, which might lead to nasty surprises.

Eric Biederman <ebiederman@lnxi.com>