[MUD-Dev] Architecture (Cell Rebalancing)

Thu Jul 3 22:35:51 CEST 2003

From: "J C Lawrence" <claw at kanga.nu> wrote:

> All memory is not equal.  Take a simple example:

>   1) Allocate a block of say 2Gig

>   2) Traverse the block writing one byte to every memory page in
>   the block.

>   3) Time how long that takes.

>   4) Now allocate a block which is a little less (128 bytes?) than
>   one memory page long (due to random alignment it will likely
>   cross two pages).  The slightly smaller than page size is to
>   accommodate sbrk() overhead which could stretch the block across
>   three pages.

>   5) Write the same number of bytes as you wrote in #2 to this new
>   block, but to (random?) locations in this smaller block.

>   6) Time how long that takes.

> Don't waste your TLB or CPU caches.

but you are not talking about memory here, you are talking about the
effect of the cache?  All my memory IS equal (unless it's mapped to
some device other than my main RAM); what makes it unequal is the
caching. is this what you meant? (and the caching changes from
architecture to architecture)
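
For anyone who wants to try the quoted experiment, here is a rough
sketch of steps 1-6. It is only an illustration under assumptions, not
J C's code: the function names, the 128-byte slack default, the fixed
random seed, and the 64 MiB size (scaled down from the 2 Gig in the
post) are all mine. Interpreter overhead will hide much of the
cache/TLB difference, so treat the numbers as a rough hint at best; a
C version should show the gap more sharply.

```python
import mmap
import random
import time

PAGE = mmap.PAGESIZE  # page size as reported by the OS


def touch_every_page(size):
    """Steps 1-3: map `size` bytes, write one byte per page, time it.

    Returns (number_of_writes, elapsed_seconds).
    """
    buf = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file
    start = time.perf_counter()
    writes = 0
    for offset in range(0, size, PAGE):
        buf[offset] = 1  # first touch of each page
        writes += 1
    elapsed = time.perf_counter() - start
    buf.close()
    return writes, elapsed


def touch_small_block(writes, slack=128, seed=0):
    """Steps 4-6: do `writes` one-byte writes at random offsets inside a
    block slightly smaller than one page, and time it.

    Returns elapsed_seconds.
    """
    size = PAGE - slack  # "a little less than one memory page long"
    buf = bytearray(size)
    rng = random.Random(seed)
    # Offsets are generated up front so the RNG cost is not timed.
    offsets = [rng.randrange(size) for _ in range(writes)]
    start = time.perf_counter()
    for offset in offsets:
        buf[offset] = 1
    return time.perf_counter() - start


if __name__ == "__main__":
    size = 64 * 1024 * 1024  # assumption: scaled down from 2 Gig
    writes, big = touch_every_page(size)
    small = touch_small_block(writes)
    print(f"{writes} writes: one per page {big:.4f}s, "
          f"small block {small:.4f}s")
```

Both runs perform the same number of one-byte writes; only the spread
of the addresses differs, which is the point of the comparison.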

>> Using shared memory, you need to optimize data layout so to
>> minimize cache misses.

> This is tough.  In the SMP case the simple fact of tracking and
> maintaining coherency for a SHM between the CPU caches is a
> non-trivial overhead.  (Intel systems do particularly badly in
> this space)

shm stands for shared memory?

i have an Intel 7505 chipset, which is so far the best Intel can
produce.  the shm issues lie less in the processor than in the MCH
and the placement of caches.

My company has, over the years, developed a telephone switch. It's
currently the most powerful single-unit switch in the world, i
believe.  (joke: our marketing department says so :) And still, when
i look into its innermost design, i am surprised and
shocked. Until recently, it ran on some Motorola processors, none of
which you would want to have in your computer by itself.  but there
were 12 of them, and the magic was in the interconnection design.
there was local memory, global memory, and various caches and extra
buses (you know the drill).  all of that together produced the best
results.

but back to our discussion - WS. It seems to me that we're both
talking about the same thing in two different languages.  is there
some pointer which can synchronise my thinking?

>> Otherwise, your program, even though it is expressed in a
>> concurrent fashion, could run very very slow.

> Quite.

has anybody tried Intel's VTune? i haven't had time. i would be
interested to see results compared for single/HT/multi-CPU.

i need to synchronise my vocabulary here - it seems to me that i
have missed some 101 where all the acronyms have been explained :)))

pietro
_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev