MSST 2015 Speaker
Paul Speciale, Scality
Enabling Petabyte-Scale Active Archives through Software Defined Storage
Unstructured data, in the form of traditional files and new object payloads, is now being generated in massive volumes by both commercial and scientific applications, with an estimated 40 zettabytes (40 trillion GB) of data projected by the year 2020.(1)
High-resolution video alone can generate over 12 TB per hour of raw 4K footage. Applications in geo-seismic exploration, energy research, medical imaging, and consumer online (cloud-based) content sharing account for the hundreds of exabytes of new unstructured data generated each year. Much of this content will require a long-term archival storage solution, as the data is reused, mined, and monetized over many years. Regulatory compliance requirements for long-term archiving will also drive considerable storage demand.
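The 12 TB/hour figure can be checked with back-of-envelope arithmetic. The sketch below assumes a DCI 4K frame at 16 bits per color channel and 60 fps; these parameters are illustrative, since actual camera formats and bit depths vary.

```python
# Back-of-envelope raw 4K data rate.
# Assumptions (illustrative, not a specific camera format):
#   DCI 4K frame (4096 x 2160), 16-bit RGB (6 bytes/pixel), 60 fps.
WIDTH, HEIGHT = 4096, 2160
BYTES_PER_PIXEL = 6          # 3 channels x 16 bits
FPS = 60

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FPS
tb_per_hour = bytes_per_second * 3600 / 1e12

print(f"{bytes_per_frame / 1e6:.1f} MB per frame")
print(f"{tb_per_hour:.1f} TB per hour")  # ~11.5 TB/hour, i.e. on the order of 12 TB/hour
```

Under these assumptions the raw stream is roughly 53 MB per frame and about 11.5 TB per hour, consistent with the "over 12 TB per hour" figure for higher frame rates or bit depths.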
The key attributes of an effective petabyte-scale active archive solution center on massive scalability, high levels of data durability and integrity, multi-protocol online access to data, and transparent migration across platform technology generations without any system or data unavailability. In this session, we investigate how modern scale-out Software Defined Storage (SDS) solutions can enable seamless capacity and performance scaling to hundreds of petabytes of data, with flexible and durable data protection policies; fast data access through a rich set of methods, including legacy file protocols and new object (REST-based) protocols; and a hardware-agnostic approach to deployment that gives users the freedom to choose the most effective platform at each stage of growth. Pure software-defined solutions let users scale a system across multiple hardware generations without manual data migrations or downtime.
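The "flexible and durable data protection policies" mentioned above are, in many scale-out object stores, implemented with either replication or erasure coding. The sketch below illustrates the capacity trade-off between the two; the specific k+m parameters are hypothetical examples, not a particular vendor's configuration.

```python
# Hedged sketch: raw-capacity overhead of replication vs. erasure coding.
# The k/m values are illustrative assumptions only.
def protection_overhead(k: int, m: int) -> float:
    """Raw-to-usable capacity ratio for a k-data + m-parity scheme.

    Data is split into k chunks plus m parity chunks; any m of the
    k + m chunks can be lost without losing the object.
    """
    return (k + m) / k

replication_3x = protection_overhead(1, 2)  # 3 full copies: 3.0x raw capacity
ec_9_3 = protection_overhead(9, 3)          # 9+3 erasure code: ~1.33x raw capacity

print(f"3x replication: {replication_3x:.2f}x raw, tolerates 2 chunk losses")
print(f"9+3 erasure code: {ec_9_3:.2f}x raw, tolerates 3 chunk losses")
```

The comparison shows why erasure coding is attractive at petabyte scale: a 9+3 code survives more simultaneous failures than triple replication while consuming less than half the raw capacity.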
(1) Source: IDC Digital Universe (http://www.kdnuggets.com/2012/12/idc-digital-universe-2020.html)