This page is quoted from http://www.vgleaks.com/playstation-4-includes-huma-technology
PlayStation 4 includes hUMA technology
There has been a lot of controversy about this matter in recent days, so we will try to clarify that PlayStation 4 supports hUMA technology, or at least implements a first revision of it. Keep in mind that AMD hasn't released products with hUMA technology yet, so it is difficult to compare PS4 with anything on the market. Besides, the specification has not been finalized, so the PS4 implementation may differ slightly from finished hUMA implementations.
But first of all, what is hUMA? hUMA is the acronym for Heterogeneous Uniform Memory Access. With hUMA, the two processors no longer distinguish between the CPU and GPU memory areas. Maybe this picture could explain the concept in an easy way:
If you want to learn more about this tech, this article explains how hUMA works.
PS4 has enhancements in the memory architecture that no other “retail” product has, as Mark Cerny pointed out in different interviews. We will try to show the new parts in PS4 components in the next pages.
We need to show our diagram of the PS4 memory architecture to explain how it works.
Mapping of memory in Liverpool
– Addresses are 40 bit. This size allows pages of memory mapped on both CPU and GPU to have the same virtual address
– Pages of memory are freely set up by the application
– Pages of memory do not need to be both mapped on CPU and GPU
    – If only the CPU will use it, the GPU does not need to have it mapped
    – If only the GPU will use it, it will access it via Garlic
– If both the CPU and GPU will access the memory page, a determination needs to be made whether the GPU should access it via Onion or Garlic
    – If the GPU needs very high bandwidth, the page should be accessed via Garlic; the CPU will need to access it as uncached memory
    – If the CPU needs frequent access to the page, it should be mapped as cached memory on the CPU; the GPU will need to access it via Onion
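The mapping rules above can be read as a small decision procedure. The following is only a sketch of that reading: `PageMapping`, `choose_mapping` and `is_valid_address` are hypothetical names, not part of any PS4 API. Two points are assumptions of ours, since the article does not cover them: a CPU-only page is treated as cached, and when both processors use a page, the GPU bandwidth requirement is the deciding flag.

```python
from dataclasses import dataclass
from typing import Optional

ADDRESS_BITS = 40  # from the article: addresses are 40 bit


@dataclass(frozen=True)
class PageMapping:
    cpu_mode: Optional[str]  # "cached", "uncached", or None if not mapped on the CPU
    gpu_bus: Optional[str]   # "garlic", "onion", or None if not mapped on the GPU


def is_valid_address(addr: int) -> bool:
    """True if addr fits in the 40-bit address space."""
    return 0 <= addr < (1 << ADDRESS_BITS)


def choose_mapping(cpu_uses: bool, gpu_uses: bool,
                   gpu_needs_high_bandwidth: bool = False) -> PageMapping:
    """Hypothetical helper that encodes the mapping rules listed above."""
    if not cpu_uses and not gpu_uses:
        raise ValueError("page is used by neither processor")
    if not gpu_uses:
        # CPU only: the GPU does not need the page mapped.
        # Assumption: the CPU maps it as ordinary cached memory.
        return PageMapping(cpu_mode="cached", gpu_bus=None)
    if not cpu_uses:
        # GPU only: access goes through Garlic.
        return PageMapping(cpu_mode=None, gpu_bus="garlic")
    if gpu_needs_high_bandwidth:
        # Shared, GPU bandwidth matters most: Garlic, uncached on the CPU.
        return PageMapping(cpu_mode="uncached", gpu_bus="garlic")
    # Shared, CPU accesses the page frequently: cached on the CPU, Onion for the GPU.
    return PageMapping(cpu_mode="cached", gpu_bus="onion")
```

For example, `choose_mapping(True, True, gpu_needs_high_bandwidth=True)` yields an uncached CPU mapping with the GPU going through Garlic.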
Five Types of Buffers
– System memory buffers that the GPU uses are tagged as one of five memory types
– The first three types have very limited CPU access; primary access is by the GPU
    – Read Only (RO)
    – Private (PV)
    – GPU coherent (GC)
– The last two types are accessible by both CPU and GPU
    – System coherent (SC)
    – Uncached (UC)
– The first three types (RO, PV, GC) may also be accessed by the CPU, but care must be taken, for example when copying a texture to a new location
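To keep the five types straight, they can be written down as an enumeration. This is only an illustrative sketch: `MemoryType`, `SHARED_TYPES` and `cpu_access_needs_care` are names we made up, and the grouping simply mirrors the list above.

```python
from enum import Enum


class MemoryType(Enum):
    READ_ONLY = "RO"
    PRIVATE = "PV"
    GPU_COHERENT = "GC"
    SYSTEM_COHERENT = "SC"
    UNCACHED = "UC"


# The two types described above as accessible by both CPU and GPU.
SHARED_TYPES = frozenset({MemoryType.SYSTEM_COHERENT, MemoryType.UNCACHED})


def cpu_access_needs_care(mem_type: MemoryType) -> bool:
    """True for RO, PV and GC, where primary access is by the GPU."""
    return mem_type not in SHARED_TYPES
```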
Tracking of Type in Memory Accesses
– Memory accesses are made via V# and T# definitions that contain the base address and other parameters of the buffer or texture
– Three bits have been added to V# and T# to specify the memory type
– An extra bit has been added to the L1 tags
– An extra bit has been added to the L2 tags
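Three bits are enough for five memory types, since three bits give eight possible codes. The sketch below shows how such a field could be packed into a descriptor word. The real V# and T# layouts are not given in the article, so everything here is hypothetical: the numeric codes, the field position (placed just above a 40-bit base address) and the helper names.

```python
# Hypothetical numeric codes; the article does not give the real encoding.
MEMORY_TYPE_CODES = {"RO": 0, "PV": 1, "GC": 2, "SC": 3, "UC": 4}
CODE_TO_TYPE = {code: name for name, code in MEMORY_TYPE_CODES.items()}

BASE_ADDRESS_BITS = 40          # addresses are 40 bit, per the article
TYPE_BITS = 3                   # three bits specify the memory type, per the article
TYPE_SHIFT = BASE_ADDRESS_BITS  # hypothetical position, right above the base address
TYPE_MASK = ((1 << TYPE_BITS) - 1) << TYPE_SHIFT


def set_memory_type(descriptor: int, mem_type: str) -> int:
    """Return the descriptor with its 3-bit type field replaced."""
    code = MEMORY_TYPE_CODES[mem_type]
    return (descriptor & ~TYPE_MASK) | (code << TYPE_SHIFT)


def get_memory_type(descriptor: int) -> str:
    """Read the 3-bit type field back as a type name."""
    return CODE_TO_TYPE[(descriptor & TYPE_MASK) >> TYPE_SHIFT]
```

Setting the type leaves the low 40 bits (the base address in this made-up layout) untouched, which is the property a real descriptor format would also need.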
Simple Example:
– Let’s take the case where most of the GPU is being used for graphics (vertex shaders, pixel shaders and so on)
– Additionally, let’s say that we have an asynchronous compute dispatch that uses a buffer in SC memory for:
– The GPU can:
1) Acquire the SC buffer by performing an L1 invalidate (GC and SC) and an L2 invalidate (SC lines only). This eliminates the possibility of stale data in the caches. Any SC address encountered will properly go off-chip (to either system memory or CPU caches) to fetch the data.
2) Run the compute shader
3) Release the SC buffer by performing an L2 writeback (SC lines only). This writes all dirty bytes back to system memory where the CPU can see them.
– The graphics processing is much less impacted by this strategy
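The acquire, run and release steps can be mimicked with a toy model of the L2 cache. This is a simplified sketch, not how the hardware works: `ToyL2` and `CacheLine` are invented for illustration, the L1 cache is left out, and a plain dictionary stands in for system memory. The `is_sc` flag plays the role of the extra tag bit.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLine:
    data: int
    is_sc: bool          # models the extra tag bit marking SC lines
    dirty: bool = False


@dataclass
class ToyL2:
    memory: dict                               # address -> value (system memory)
    lines: dict = field(default_factory=dict)  # address -> CacheLine

    def read(self, addr: int, is_sc: bool) -> int:
        if addr not in self.lines:             # miss: fetch from system memory
            self.lines[addr] = CacheLine(self.memory[addr], is_sc)
        return self.lines[addr].data

    def write(self, addr: int, value: int, is_sc: bool) -> None:
        self.lines[addr] = CacheLine(value, is_sc, dirty=True)

    def invalidate_sc(self) -> None:
        """Acquire: drop SC lines only, so later SC reads go to memory."""
        self.lines = {a: l for a, l in self.lines.items() if not l.is_sc}

    def writeback_sc(self) -> None:
        """Release: write dirty SC lines back; other lines are untouched."""
        for addr, line in self.lines.items():
            if line.is_sc and line.dirty:
                self.memory[addr] = line.data
                line.dirty = False


memory = {0x100: 1, 0x200: 7}
l2 = ToyL2(memory)
l2.read(0x200, is_sc=False)          # a graphics line sits in the cache
l2.read(0x100, is_sc=True)           # an SC line is cached, soon to be stale
memory[0x100] = 2                    # the CPU updates the SC buffer

l2.invalidate_sc()                   # 1) acquire
value = l2.read(0x100, is_sc=True)   # 2) "compute shader" reads fresh input
l2.write(0x100, value * 10, is_sc=True)
l2.writeback_sc()                    # 3) release: the CPU can now see the result
```

In this model the graphics line at `0x200` stays in the cache through both the invalidate and the writeback, which is the point of restricting those operations to SC lines.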
This technical information can be a bit overwhelming and confusing, so we will disclose more information and examples of use of this architecture in a new article this week.