Tuesday, November 13, 2012

Future trends on storage

Future trends on hard drives:

1. areal density (AD) growth
    challenge: interference between neighboring blocks
    SMR (shingled magnetic recording, overlapping sectors): can do sequential reads/writes and random reads, but can't do random writes
    heat-assisted magnetic recording to increase AD (asymmetric temperature for read and write: recording is done heated, using a laser to heat the media)
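
(my own sketch, not from the talk) The SMR constraint above, sequential writes and random reads but no random writes, can be modeled as a zone with a write pointer. The class name `ShingledZone` and its methods are made up for illustration; real drives expose this differently.

```python
class ShingledZone:
    """Toy model of one shingled zone: append-only writes, random reads."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks
        self.write_pointer = 0  # index of the only block that may be written next

    def write(self, lba, data):
        if self.write_pointer >= len(self.blocks):
            raise ValueError("zone is full; reset it before writing again")
        # Tracks overlap, so writing anywhere but the write pointer would
        # clobber data on the neighboring track.
        if lba != self.write_pointer:
            raise ValueError(
                "random write to block %d rejected; next writable block is %d"
                % (lba, self.write_pointer)
            )
        self.blocks[lba] = data
        self.write_pointer += 1

    def read(self, lba):
        # Any block that has already been written can be read in any order.
        if not 0 <= lba < self.write_pointer:
            raise ValueError("block %d has not been written" % lba)
        return self.blocks[lba]

    def reset(self):
        # Updating old data means resetting the zone and rewriting it in order.
        self.blocks = [None] * len(self.blocks)
        self.write_pointer = 0
```

Writes to blocks 0, 1, 2 in order succeed and can be read back in any order; a second write to block 0 raises until the zone is reset.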


2. what about SSDs:
    SSDs are important for improving performance
    not a viable candidate for capacity though (high cost for fabs, but not that much revenue for the industry as a whole)


Future trends on NV memory (Fusion-io)
 multi-layer vision
multi-layer memory (less reliable) makes up the vast majority of what is used in data centers

 how to effectively use flash:
  hierarchy of DRAM, flash, disk
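
(my own sketch, not from the talk) One way to picture the DRAM/flash/disk hierarchy is a read path that checks the fastest tier first and demotes evicted blocks to the next tier. `TieredStore`, the LRU policy, and the slot counts are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredStore:
    """Toy read path: DRAM cache over flash cache over disk (all dicts here)."""

    def __init__(self, disk, dram_slots, flash_slots):
        self.disk = disk            # backing store, holds every block
        self.dram = OrderedDict()   # smallest, fastest tier (kept in LRU order)
        self.flash = OrderedDict()  # larger, slower tier (kept in LRU order)
        self.dram_slots = dram_slots
        self.flash_slots = flash_slots

    @staticmethod
    def _put(tier, slots, key, value):
        tier[key] = value
        tier.move_to_end(key)
        if len(tier) > slots:
            return tier.popitem(last=False)  # evict the least recently used
        return None

    def read(self, key):
        """Return (value, name of the tier that served the read)."""
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key], "dram"
        if key in self.flash:
            value, source = self.flash.pop(key), "flash"
        else:
            value, source = self.disk[key], "disk"
        evicted = self._put(self.dram, self.dram_slots, key, value)
        if evicted is not None:
            # Blocks pushed out of DRAM fall to flash; blocks pushed out of
            # flash are dropped because disk still holds them.
            self._put(self.flash, self.flash_slots, *evicted)
        return value, source
```

With one DRAM slot, reading block 0, then block 1, then block 0 again serves the last read from flash, since block 0 was demoted when block 1 came in.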

 Fusion-io: API to interact with flash directly instead of through the traditional block interface
             API more appropriate for flash media (transactional semantics, etc.)
             more memory-like semantics for flash instead of the traditional storage view of flash, with a corresponding API

    basic io: read/write
    transaction io: commit
    memory-like: ability to chase a pointer
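
(my own sketch, not from the talk, and NOT Fusion-io's actual API) The three kinds of operations listed above could look roughly like this. Every name here (`ToyFlashStore`, `stage`, `commit`, `chase`, `Ptr`) is hypothetical, and staging writes until commit is just one assumed way to get transactional semantics.

```python
from collections import namedtuple

# Hypothetical pointer type: a stored value that refers to another address.
Ptr = namedtuple("Ptr", ["addr"])


class ToyFlashStore:
    """In-memory stand-in for a flash device with three styles of access."""

    def __init__(self):
        self._committed = {}
        self._pending = {}

    # basic io: read/write
    def read(self, addr):
        return self._committed.get(addr)

    def write(self, addr, value):
        self._committed[addr] = value

    # transaction io: staged writes become visible together at commit
    def stage(self, addr, value):
        self._pending[addr] = value

    def commit(self):
        self._committed.update(self._pending)
        self._pending.clear()

    def abort(self):
        self._pending.clear()

    # memory-like: follow pointers stored on the device
    def chase(self, addr, max_hops=64):
        value = self.read(addr)
        hops = 0
        while isinstance(value, Ptr):
            hops += 1
            if hops > max_hops:
                raise RuntimeError("pointer chain too long (cycle?)")
            value = self.read(value.addr)
        return value
```

Staged writes are invisible to `read` until `commit`, and `chase("b")` follows `Ptr("a")` to return whatever is stored at `"a"`.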

  challenges:
     1. reliability with low-cost/high-density media
     2. integration with existing software stacks, caches, tiering (flash is consumer-oriented, not data-center-oriented, so it is up to the software people to make it work for data centers)
     3. system and data center implications - networking, scale out vs. scale up
   

Future trends on data protection/backup (Data Domain)
phase 0: tape
phase 1: deduplicated disk
              difference between backup storage and primary storage (see their FAST'12 paper): backup doesn't care about IOPS
              so disk should be optimized for backup purposes
phase 2: optimized deduplicated disk (disk no longer behaves like tape, so you can do things differently than you did with tapes)
              new I/O interfaces for doing backups
              incremental forever, virtual full (instead of weekly full backups and daily incrementals)
phase 3: integrated data protection (backup) silos
                now we end up doing multiple backups, by everybody and at each layer (and you don't know how much your organization is spending on backups!)
phase 4: solve the problems of phase 3
               provide a data protection data cloud (what is that?)
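
(my own sketch, not from the talk) To make phases 1 and 2 concrete: a deduplicated store keeps each distinct piece of data once, and "incremental forever, virtual full" means every backup only sends changed files but can still be restored as if it were a full backup. `DedupStore` is a made-up name, and deduplicating whole files by SHA-256 is a simplification; real systems chunk below the file level.

```python
import hashlib


class DedupStore:
    """Toy backup store: dedup by content hash, every backup is a virtual full."""

    def __init__(self):
        self.chunks = {}   # fingerprint -> bytes, each distinct content stored once
        self.backups = []  # one recipe per backup: {path: fingerprint}

    def _put(self, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(fingerprint, data)  # no-op if already stored
        return fingerprint

    def backup_incremental(self, changed_files, deleted=()):
        # Start from the previous recipe, so only changed files are sent,
        # yet the new recipe describes the complete file set (a virtual full).
        recipe = dict(self.backups[-1]) if self.backups else {}
        for path, data in changed_files.items():
            recipe[path] = self._put(data)
        for path in deleted:
            recipe.pop(path, None)
        self.backups.append(recipe)
        return len(self.backups) - 1  # backup id

    def restore(self, backup_id):
        recipe = self.backups[backup_id]
        return {path: self.chunks[fp] for path, fp in recipe.items()}
```

Restoring any backup id returns the full file set as of that backup, with no need to replay a weekly full plus daily incrementals.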
             
Why buy innovation (Microsoft Research; not a storage guy, works on hardware acceleration for search engines)

what innovations for data centers?
1. a different scale-cost curve (lower cost at the same scale) for the same value, but innovation has a fixed start-up cost (even at 0 scale)
2. different (and greater than linear!) scale-value curve!
    e.g., new capability, competitive advantage, new business
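
(my own sketch, not from the talk) For category 1, the fixed start-up cost means the innovation only pays off past some scale. Assuming both cost curves are linear (my assumption, the talk did not give a model), the break-even scale is fixed cost divided by the per-unit saving. The function name and the numbers in the usage note are made up.

```python
def break_even_scale(fixed_cost, unit_cost_old, unit_cost_new):
    """Scale at which fixed_cost + unit_cost_new * n equals unit_cost_old * n.

    Assumes linear cost curves; returns None if the innovation never pays off.
    """
    saving_per_unit = unit_cost_old - unit_cost_new
    if saving_per_unit <= 0:
        return None  # no per-unit saving, so the fixed cost is never recovered
    return fixed_cost / saving_per_unit
```

For example, a fixed cost of 1000 with per-unit cost dropping from 10 to 8 breaks even at a scale of 500 units; below that, the old approach is cheaper.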


University research typically focuses on category 1 but not category 2, while industry typically wants category 2 a lot.
Why does academia try so hard to make little performance tweaks instead of thinking about 2???



Questions:
1. why increase areal density rather than just adding platters or RPM?
     there are limits on how many platters and how much RPM you can add (energy, say)

2. what are the best mechanisms for the new interfaces?
     people like to write programs differently (some people like the memory model, while others do well with I/O)

3. when will these new media hit the mainstream?
     well, still in an early phase.... we'll see where they could go
   
4. for category 2 research, it's harder to measure new value than to measure performance. How do we convince people our research has value?
    it's the decision of the community as a whole to reward type 2 research. 30 years ago we had papers with fewer measurements than today's, so it is a culture problem of the community

5. why didn't you mention arrays? (disk arrays, flash arrays, memory arrays, etc.)
    RAIDs are well known
    people are building flash arrays, and there are some new things there compared to traditional RAIDs

6. innovations on interfaces to storage: for years we have had read/write blocks; what has happened to change that? Or are we going to end up with read/write blocks anyway? (by Remzi)
    Data Domain: we got value by changing the interface
    Fusion-io: the media has changed, and the role of the open source community has changed too. Even though new interfaces are not being picked up by old apps, they could be used by new apps
    Microsoft: if we could get a lot of performance, then it's worth changing the interface
    Data Domain: it's not just performance; a lot of the time it's for new capabilities






