Craftsmanship and Artist Techniques

映像制作分野から見たFPGAに対する関心

Interest in FPGA from the Perspective of Video Production

（株式会社ポリゴン・ピクチュアズ / スタジオフォンズ）

(Polygon Pictures Inc. / Studio Phones)

今後の会合の状況を見つつ加筆修正を含めブラッシュアップを進めていく予定です。

We plan to expand and revise this document as future seminars take shape.

translated by PPI Translation Team

■概要

■Overview

FPGAなどSoC(System on a Chip)が身近になりつつあり、映像制作でもこれらの技術を念頭にどのようにパラダイムシフトすべきか模索しているスタジオもそろそろ増えていく時期ではないかと思われます。
実際、FPGA搭載Xeon Scalable Processor搭載のストレージなどのニュースや、機械学習などでの高負荷計算への進展に関しても様々な形で日々目にすることもあるかと思いますが、今回の会合ではラウンドテーブルでこの話題を取り上げるため、この資料では多少現在考えられる映像制作インフラ面でのFPGA関心に関して軽く触れていきたいと思います。
モデレーター達もディスカッションへ向けての準備として下記の資料を記載させて頂いております。十二分に調べたとは言えませんので、どうかその前提でお読み下さい。

As FPGAs and other SoC (System on a Chip) devices become more accessible, we expect that more and more studios will soon start exploring how video production should shift its paradigm with these technologies in mind.
You have probably come across news almost daily about storage built on Xeon Scalable Processors with integrated FPGAs, or about advances in high-load computation such as machine learning. Because this topic will be taken up in a round table at the seminar, this document briefly touches on the interest in FPGAs from the standpoint of video production infrastructure as we currently see it.
The moderators have compiled the material below as preparation for the discussion. It is not the result of exhaustive research, so please read it with that in mind.

■FPGAそのものへの関心

■Interest in FPGA

CPU主体の計算クラスターを大規模に抱える多くの映像制作スタジオの場合、その電力の獲得、維持、それに伴う各種メンテナンス、躯体ベースでのエアフロー確保とサーバルーム全体のエアフロー確保、発火などの非常時対策とこれらに伴う24時間365日の管理体制構築と維持に、神経をすり減らし運営していることでしょう。そこに加え、熱管理をCPU以上に慎重に扱うGPUクラスターなども近年は割合いを増し始め、デリケートな計算を扱う場合はGPUの癖も理解し、その上で電力効率と計算効率、バッチ処理時の各スケジューリング問題などを的確に扱い、その上で可能な限り無駄を減らし、エコロジカルな意識とコスト削減の精神でもって日常的に運用していることと思います。
その課題に関して、近年は大規模計算へのFPGA利用も増え始め、サーバー内での日常的に扱う計算をFPGAで行うなど、比較的大手企業からFPGAに関する積極利用が始まっていると言えます。加えて、Altera/IntelのFPGA搭載Xeon Scalable Processorに関するニュースやそれを実際に搭載したストレージなどの登場により、キャッシュミスの少なさと電力効率の良さ、Intel HLS Compilerによるmathライブラリを含む様々なサポートの進展など、FPGAそのものへの関心は性能と効率、コスト削減を考えていく上で大きな課題になっているとモデレータ達は考えています。

Many video production studios that operate large CPU-based compute clusters are probably wearing themselves out keeping them running: securing and maintaining electric power, carrying out the associated maintenance, ensuring airflow both per chassis and across the entire server room, preparing for emergencies such as fire, and building and sustaining a management system that covers all of this 24 hours a day, 365 days a year. On top of that, GPU clusters, whose heat management demands even more care than that of CPUs, have been taking up a growing share in recent years. For delicate computations, studios have to understand the quirks of GPUs, handle power efficiency, computational efficiency, and batch scheduling issues precisely, and cut waste wherever possible, operating day to day with both ecological awareness and cost reduction in mind.
To address these challenges, the use of FPGAs for large-scale computation has begun to increase in recent years, and relatively large companies have started to adopt FPGAs proactively, for example by moving routine in-server computation onto FPGAs. In addition, news about Altera/Intel Xeon Scalable Processors with integrated FPGAs and the arrival of storage actually equipped with them, together with the few cache misses, good power efficiency, and growing support such as the math library provided with the Intel HLS Compiler, lead the moderators to believe that FPGAs themselves have become a major topic when considering performance, efficiency, and cost reduction.

■ヘテロジニアスコンピューティングへの対応

■Response to Heterogeneous Computing

ムーアの法則の曲がり角として語られることの多いヘテロジニアスコンピューティングへの対応というフレーズは、次第に映像制作でも真剣に考えていかなければならない問題の多い事柄になって来ていると思われます。
例えば、計算クラスター内で機械学習などで学習後のモデルを推論モデルとして常時稼働などの事柄を考えていく場合、省エネ性などを考慮してFPGAも視野に入って来ている現状があり、PC/Serverモデルからクライアント/クラウド（パブリック及びプライベートクラウド）での変化と同様に、今後益々これらの変化への対応を考えていくことは多くの分野と同様に映像制作でも自然な展開になっていくと思われます。
ドメイン固有アーキテクチャーなど今までと違った切り口で映像制作パイプラインで扱う各種計算をどう考えて行くかなど、今回の会合では多少でもディスカッションを行って行きたいと思います。

The phrase “responding to heterogeneous computing”, often mentioned as marking a turning point for Moore’s law, is gradually becoming a thorny issue that video production must also consider seriously.
For example, when considering keeping a trained machine learning model running continuously as an inference model inside a compute cluster, FPGAs are now coming into view because of their energy efficiency. Just as the shift from the PC/server model to the client/cloud model (public and private cloud) called for a response, thinking about how to respond to these changes will increasingly become a natural development in video production, as it has in many other fields.
At this seminar, we would like to discuss, even if only briefly, how to think about the various computations handled in the video production pipeline from angles different from before, such as domain-specific architectures.
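
As a rough illustration of why energy efficiency matters for an inference model that runs around the clock, the arithmetic can be sketched as follows. This is only a back-of-the-envelope sketch: the function names are hypothetical, and the wattage figures are placeholders chosen for illustration, not measurements of any CPU, GPU, or FPGA.

```python
# Back-of-the-envelope energy arithmetic for an always-on service.
# All names and wattage values here are illustrative assumptions.

HOURS_PER_YEAR = 24 * 365  # continuous operation, non-leap year


def annual_energy_kwh(average_power_watts: float) -> float:
    """Energy used in one year by a device drawing a constant average power."""
    return average_power_watts * HOURS_PER_YEAR / 1000.0


def annual_saving_kwh(baseline_watts: float, candidate_watts: float) -> float:
    """Yearly energy difference between a baseline device and a candidate."""
    return annual_energy_kwh(baseline_watts) - annual_energy_kwh(candidate_watts)


if __name__ == "__main__":
    # Placeholder figures only; substitute values measured in your own cluster.
    baseline_watts = 250.0
    candidate_watts = 50.0
    saving = annual_saving_kwh(baseline_watts, candidate_watts)
    print(f"Estimated saving per device: {saving:.0f} kWh per year")
```

Multiplying the per-device figure by the number of always-on nodes gives a first estimate to weigh against FPGA development cost. Cooling overhead is deliberately left out of this sketch.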

■高負荷計算や自社サーバ内での自社サービスへの利用と集積回路

■Integrated circuits for high-load computation and in-house services on in-house servers

省エネかつ高負荷計算を行うインフラ基盤を構築するにあたって、ASICでTPUなどの開発するなどのアプローチもGoogle社の事例に限らず出てきていると思います。しかし一方、ASICはその生産上、製造コストが映像制作スタジオでなかなか出せる範囲でもなく、設計さえ済めば平易に回路を書き換え可能なFPGAで同様の展開を進める企業も多いのも現状かと思います。
クラウドサービスでのFPGA対応も近年では非常に進んで来ており、機械学習や深層学習で得られたモデルを活用する際のサーバー利用など、今後は映像やゲームの分野でも事例が増えても不思議はない状況かと思われますし、実際、ONNX(Open Neural Network Exchange)のサイトにUnityのロゴも見えるように、映像やゲームに限らない様々な利用でのゲームエンジン利用も含め、今後は変化が出て行くのではないか、映像制作の上では、画像処理などのオンプレミスでのサーバーサイドサービスや、スケールアウト系ストレージ内での常時巡回型のサービス、計算クラスタ内で動作する内製的機能のアクセラレーターなど、開発コストが見合うなら導入も視野に入るのでは？ともモデレーター達は考えております。
年々、FPGAの開発自体は開発しやすさを増している傾向にあり、この会合では、将来のFPGA利用を想定し、どのような用い方が可能かのディスカッションを進めたいと思います。

To build infrastructure that handles high-load computation while remaining energy efficient, companies other than Google are probably also taking the approach of developing ASICs such as the TPU. However, the manufacturing cost of an ASIC is not something a video production studio can readily afford, and many companies are instead pursuing similar developments with FPGAs, whose circuits can be rewritten easily once the design is done.
FPGA support in cloud services has also advanced considerably in recent years. It would not be surprising to see more examples in video and games, such as server-side use of models obtained through machine learning or deep learning. Indeed, as the Unity logo on the ONNX (Open Neural Network Exchange) website suggests, the moderators expect changes ahead, including the use of game engines for purposes beyond video and games.
In video production, the moderators also think that adoption could come into view if development costs are justified, for example for on-premises server-side services such as image processing, continuously patrolling services inside scale-out storage, or accelerators for in-house functions running inside compute clusters.
FPGA development itself is becoming easier year by year. At this seminar, we would like to discuss what kinds of use are possible, on the assumption that FPGAs will be used in the future.

■まとめ

■Conclusion

FPGAの映像制作インフラへの採用はまだまだ検討段階である印象はありますが、近年のIntel HLS Compilerなどの進展は目覚ましく、会合までにもまだまだ状況は変わっていそうな気がします。前述のような様々な利点（もちろんまだデメリットも多いとは考えている）を念頭に、会合内のラウンドテーブルではその運用面の可能性に関して、ざっくばらんにディスカッションしたいと思います。

[参考資料]FPGAに関心を持つ背景
・RISC-V, 50years
https://riscv.org/wp-content/uploads/2017/05/Mon0915-RISC-V-50-Years-Computer-Arch.pdf

[参考資料]SoCパラダイムとこれからの制作環境

SoC(System on a Chip)が身近になりつつあり、FPGA搭載Xeon Scalable Processor搭載のストレージなどのニュースや、機械学習などでの高負荷計算への進展に関しても様々な形で日々目にすることもあるかと思います。
C、C++、OpenCLなど、映像制作やゲーム開発で慣れ親しんだ環境との親和性が高まっていくなか、SoC FPGA製品向けLinux関係の情報も多く、ここから数年でSDSといったストレージだけでなく様々な演算への適切な適用に関して、わたしたちも模索の日々を過ごすこととなると思われます。
現状まだまだ模索の段階ではありますが、今回の会合では、多少のモデレーター側で調べた情報の解説と、今後考えて行きたいユースケースの提示などを行なっていく予定です。

[題材]FPGAに置き換えて行きたい処理など（検討用）
・サーバーサイドでの高負荷計算全般
　スケールするものとしないものを分けてディスカッションしていく
・Denoiserなどのサーバーサイドでの処理
　RenderingやSimulation,Composite時などの各種部分処理
・リアルタイム処理のランタイム向け(ONNXなどを用いた各種計算の可能性)
・L4,L7LBなど負荷分散関係

インフラ設計を行う際には、映像制作での制作コストを下げていく意味での省エネ化や、映像制作パイプラインへのより良い統合、プリプロ段階でのアーティスト支援系サービスの省エネ化と高速処理の両立などが課題になるとモデレーター達は考えておりますが、現在はまだFPGAでの開発コストも多少高い傾向があり、急激に進むインフラ面での変革を題材に、今後の可能性と難度が生じる事柄に関して考えて行きたいと考えております。
特には、機械学習、深層学習などの利用模索を進めているスタジオでのユースケースを意識し、省エネ化と運用コスト低減などを念頭に、じっくりディスカッションを進めたいと思います。

[資料]FPGAの基本的な資料など（Intel® 関係）

・Intel® FPGA SDK for OpenCL™
https://www.altera.co.jp/products/design-software/embedded-software-developers/opencl/overview.html

・Intel® SoC FPGA
https://www.altera.co.jp/products/design-software/embedded-software-developers/overview.html

[資料]FPGAに関する参考になりそうな資料など

・rocketboards
https://rocketboards.org

・FPGA, Web Server Design Example
https://www.altera.com/support/support-resources/design-examples/intellectual-property/embedded/nios-ii/exm-micro_tutorial.html

・4K Video Upscaling Format Conversion
https://www.altera.com/products/reference-designs/all-reference-designs/broadcast/ref-4k-video-upscaling.html

・AWS EC2 F1 Instance
https://aws.amazon.com/jp/ec2/instance-types/f1/

[資料]FPGAに関する各種ツールの資料など

・Intel HLS Compiler
https://www.altera.co.jp/products/design-software/high-level-design/intel-hls-compiler/overview.html

・OPAE(Open Programmable Acceleration Engine)
https://www.intel.co.jp/content/www/jp/ja/programmable/solutions/acceleration-hub/acceleration-stack.html

・OPAE Intel FPGA Linux Device Driver Architecture
https://www.intel.com/content/www/us/en/programmable/documentation/swn1503506366945.html

・Open Programmable Acceleration Engine
https://opae.github.io/latest/index.html

・M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB. Functionality, Performance and TCO
https://www.slideshare.net/MariaDB/m18-intel-and-mariadb

[資料]計算クラスター向けに題材にしたい資料

・The Parallel Universe
https://jp.xlsoft.com/documents/intel/magazine/Intel_ParallelUniverse_Issue31_JPN.pdf

・OpenCL* on Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA Quick Start User Guide
https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug-qs-ias-opencl-a10-v1-1.pdf

・OpenCL Vector Addition Design Example
https://www.intel.co.jp/content/www/jp/ja/programmable/support/support-resources/design-examples/design-software/opencl/vector-addition.html

・nemu Modern Hypervisor for the Cloud
https://github.com/intel/nemu

[資料]機械学習、深層学習及びそれらのruntimeとFPGAのディスカッション向けの資料

・ONNX
https://onnx.ai/

・ONNX Runtime
https://github.com/microsoft/onnxruntime

・ONNX Supported tools
https://onnx.ai/supported-tools

・NNEF
https://www.khronos.org/nnef

・ONNX Model Zoo
https://github.com/onnx/models

・ONNX JS
https://github.com/Microsoft/onnxjs

・ONNX JS demo
https://github.com/Microsoft/onnxjs-demo

・Tensorflow JS
https://js.tensorflow.org/

・Azure FPGA
https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-fpga-web-service

[資料]機械学習と深層学習のcompiler関係（FPGA対応とは限らない）

・glow
https://github.com/pytorch/glow

・tvm
https://tvm.ai/

[資料]MIPS,RISC-V関係、FPGAと絡めてディスカッションするための資料

・MIPS Open™
https://www.mips.com/mipsopen/

・MIPS fpga
https://www.mips.com/downloads/mipsfpga-getting-started-guide-2-0/

・MIPS fpga github
https://github.com/MIPSfpga/

・RISC-V Foundation
https://riscv.org/

・RISC-V Foundation to leverage the Linux Foundation's tools, infrastructure, services and training programs
https://www.linuxfoundation.org/the-linux-foundation/2018/11/the-linux-foundation-and-risc-v-foundation-announce-joint-collaboration-to-enable-a-new-era-of-open-architecture/

[資料]Intel(R) Graphics Compiler for OpenCL(TM)とllvm関係の資料など

・Intel(R) Graphics Compiler for OpenCL(TM)
https://github.com/intel/intel-graphics-compiler

・LLVM
https://llvm.org/

・LLVM Clang
http://clang.llvm.org/

Although adoption of FPGAs in video production infrastructure still appears to be at the evaluation stage, recent progress such as the Intel HLS Compiler has been remarkable, and the situation may well change further before the seminar. With the various benefits described above in mind (while recognizing that there are still many drawbacks), we would like to have a frank round-table discussion at the seminar about the operational possibilities.

[Reference Material] Background to the Interest in FPGA
・RISC-V, 50years
https://riscv.org/wp-content/uploads/2017/05/Mon0915-RISC-V-50-Years-Computer-Arch.pdf

[Reference Material] SoC Paradigm and Production Environments in the Future

As SoC (System on a Chip) devices become more accessible, you have probably come across news almost daily about storage built on Xeon Scalable Processors with integrated FPGAs, or about advances in high-load computation such as machine learning.
As compatibility grows with environments familiar from video production and game development, such as C, C++, and OpenCL, and with plenty of Linux-related information available for SoC FPGA products, we expect to spend the next few years exploring how to apply these devices appropriately, not only to storage such as SDS but to a wide range of computation.
We are still at the exploratory stage, but at this seminar we plan to explain what the moderators have been able to look into and to present use cases we would like to consider going forward.

[Topic] Processing We Would Like to Move to FPGA (for Discussion)
・High-load computation on the server side in general
　Discuss workloads that scale separately from those that do not.
・Server-side processing such as denoisers
　Various partial processing steps during rendering, simulation, compositing, etc.
・Runtimes for real-time processing (possibilities for various computations using ONNX and the like)
・Load balancing, such as L4 and L7 load balancers
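
One rough way to separate workloads that scale from those that do not is Amdahl's law: if only a fraction of a job can be moved to an accelerator, the part left behind bounds the overall speedup. The sketch below is an illustration under stated assumptions, with a hypothetical function name and placeholder figures, not a measurement of any FPGA.

```python
# Amdahl's law as a first filter for offload candidates.
# The function name and the example figures are illustrative assumptions.


def amdahl_speedup(offload_fraction: float, accel_factor: float) -> float:
    """Overall speedup when `offload_fraction` of the runtime is made
    `accel_factor` times faster and the rest is left unchanged."""
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    remaining = 1.0 - offload_fraction
    return 1.0 / (remaining + offload_fraction / accel_factor)


if __name__ == "__main__":
    # Placeholder scenarios: (fraction of runtime offloaded, accelerator factor)
    for fraction, factor in [(0.5, 10.0), (0.9, 10.0), (0.99, 10.0)]:
        speedup = amdahl_speedup(fraction, factor)
        print(f"offload {fraction:.0%} at {factor:.0f}x -> {speedup:.2f}x overall")
```

With half of a job offloaded, the overall speedup can never exceed 2x however fast the accelerator is, which is one reason to profile a pipeline stage before estimating the return on FPGA development cost.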

When designing infrastructure, the moderators see the challenges as saving energy in order to lower production costs, integrating better into the video production pipeline, and achieving both energy savings and fast processing for artist-support services in the pre-production phase. At present, however, FPGA development costs still tend to be somewhat high. Taking the rapid transformation of infrastructure as our subject, we would like to consider both the future possibilities and the difficulties that arise.
In particular, with use cases in studios that are exploring machine learning and deep learning in mind, we would like to have a thorough discussion focused on energy savings and lower operating costs.

[Sources] Basic FPGA Materials (Intel® Related)

・Intel® FPGA SDK for OpenCL™
https://www.altera.co.jp/products/design-software/embedded-software-developers/opencl/overview.html

・Intel® SoC FPGA
https://www.altera.co.jp/products/design-software/embedded-software-developers/overview.html

[Sources] Potentially Useful Materials on FPGA

・rocketboards
https://rocketboards.org

・FPGA, Web Server Design Example
https://www.altera.com/support/support-resources/design-examples/intellectual-property/embedded/nios-ii/exm-micro_tutorial.html

・4K Video Upscaling Format Conversion
https://www.altera.com/products/reference-designs/all-reference-designs/broadcast/ref-4k-video-upscaling.html

・AWS EC2 F1 Instance
https://aws.amazon.com/jp/ec2/instance-types/f1/

[Sources] Materials on FPGA-Related Tools

・Intel HLS Compiler
https://www.altera.co.jp/products/design-software/high-level-design/intel-hls-compiler/overview.html

・OPAE(Open Programmable Acceleration Engine)
https://www.intel.co.jp/content/www/jp/ja/programmable/solutions/acceleration-hub/acceleration-stack.html

・OPAE Intel FPGA Linux Device Driver Architecture
https://www.intel.com/content/www/us/en/programmable/documentation/swn1503506366945.html

・Open Programmable Acceleration Engine
https://opae.github.io/latest/index.html

・M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB. Functionality, Performance and TCO
https://www.slideshare.net/MariaDB/m18-intel-and-mariadb

[Sources] Materials We Would Like to Discuss for Compute Clusters

・The Parallel Universe
https://jp.xlsoft.com/documents/intel/magazine/Intel_ParallelUniverse_Issue31_JPN.pdf

・OpenCL* on Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA Quick Start User Guide
https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug-qs-ias-opencl-a10-v1-1.pdf

・OpenCL Vector Addition Design Example
https://www.intel.co.jp/content/www/jp/ja/programmable/support/support-resources/design-examples/design-software/opencl/vector-addition.html

・nemu Modern Hypervisor for the Cloud
https://github.com/intel/nemu

[Sources] Materials for Discussing Machine Learning, Deep Learning, Their Runtimes, and FPGA

・ONNX
https://onnx.ai/

・ONNX Runtime
https://github.com/microsoft/onnxruntime

・ONNX Supported tools
https://onnx.ai/supported-tools

・NNEF
https://www.khronos.org/nnef

・ONNX Model Zoo
https://github.com/onnx/models

・ONNX JS
https://github.com/Microsoft/onnxjs

・ONNX JS demo
https://github.com/Microsoft/onnxjs-demo

・Tensorflow JS
https://js.tensorflow.org/

・Azure FPGA
https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-fpga-web-service

[Sources] Compilers for Machine Learning and Deep Learning (Not Necessarily FPGA-Capable)

・glow
https://github.com/pytorch/glow

・tvm
https://tvm.ai/

[Sources] Materials on MIPS and RISC-V, for Discussion in Connection with FPGA

・MIPS Open™
https://www.mips.com/mipsopen/

・MIPS fpga
https://www.mips.com/downloads/mipsfpga-getting-started-guide-2-0/

・MIPS fpga github
https://github.com/MIPSfpga/

・RISC-V Foundation
https://riscv.org/

・RISC-V Foundation to leverage the Linux Foundation's tools, infrastructure, services and training programs
https://www.linuxfoundation.org/the-linux-foundation/2018/11/the-linux-foundation-and-risc-v-foundation-announce-joint-collaboration-to-enable-a-new-era-of-open-architecture/

[Sources] Intel® Graphics Compiler for OpenCL™ and LLVM-Related Materials

・Intel(R) Graphics Compiler for OpenCL(TM)
https://github.com/intel/intel-graphics-compiler

・LLVM
https://llvm.org/

・LLVM Clang
http://clang.llvm.org/