Future trends in hard drives:
1. areal density growth
challenge: interference between neighboring blocks
SMR (shingled magnetic recording, overlapping tracks): can do sequential reads/writes and random reads, but can't do random writes
heat-assisted magnetic recording (HAMR) to increase areal density (asymmetric temperatures for read and write: recording happens while the medium is heated, and a laser does the heating)
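The SMR constraint noted above (sequential writes and random reads are fine, random writes are not) can be sketched as a toy append-only zone. This is only an illustration of the access rule, not a model of any real drive; the class name `ShingledZone`, the write pointer, and the `reset` method are assumptions made up for this sketch.

```python
class ShingledZone:
    """Toy model of one shingled zone: append-only writes, random reads.

    Hypothetical illustration only; names and behavior are assumptions.
    """

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks
        self.write_pointer = 0  # index of the only block that may be written next

    def write(self, block_index, data):
        if self.write_pointer >= len(self.blocks):
            raise ValueError("zone is full; reset it before rewriting")
        # A write anywhere but the write pointer would clobber overlapping
        # neighbors, so the model rejects it.
        if block_index != self.write_pointer:
            raise ValueError("random write not allowed in a shingled zone")
        self.blocks[block_index] = data
        self.write_pointer += 1

    def read(self, block_index):
        # Any block that has already been written can be read in any order.
        if not 0 <= block_index < self.write_pointer:
            raise IndexError("block has not been written yet")
        return self.blocks[block_index]

    def reset(self):
        # Rewriting data means starting the whole zone over.
        self.blocks = [None] * len(self.blocks)
        self.write_pointer = 0
```

Writing blocks 0, 1, 2 in order succeeds and they can then be read in any order, while a second write to block 0 raises `ValueError` until the zone is reset.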
2. what about SSDs:
SSDs are important for improving performance
not a viable candidate for capacity, though (fabs are costly, but there is not that much revenue across the whole industry)
Future trends in NV memory (Fusion-io)
multi-layer vision
multi-layer memory (less reliable) makes up the vast majority of what is used in data centers
how to use flash effectively:
hierarchy of DRAM, flash, disk
Fusion-io: API to interact with flash directly instead of through the traditional block interface
API more appropriate for flash media (transactional semantics, etc.)
more memory-like semantics for flash instead of the traditional storage view of flash, with a corresponding API
basic I/O: read/write
transactional I/O: commit
memory-like: ability to chase a pointer
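The three levels of interface listed above (basic, transactional, memory-like) can be sketched with a toy in-memory store. This is not Fusion-io's actual API, which these notes do not describe in detail; `ToyFlashStore` and every method name below are hypothetical and exist only to show how the three styles differ.

```python
class ToyFlashStore:
    """Hypothetical in-memory stand-in for a flash device.

    Not a real vendor API; all names here are assumptions for illustration.
    """

    def __init__(self):
        self._committed = {}  # address -> value, the visible state
        self._pending = {}    # writes staged in the open transaction

    # Basic I/O: read/write take effect immediately.
    def read(self, addr):
        return self._committed.get(addr)

    def write(self, addr, value):
        self._committed[addr] = value

    # Transactional I/O: staged writes become visible together on commit.
    def tx_write(self, addr, value):
        self._pending[addr] = value

    def commit(self):
        self._committed.update(self._pending)
        self._pending.clear()

    def abort(self):
        self._pending.clear()

    # Memory-like access: treat each stored value as the address of the
    # next node and follow the chain, as when chasing a pointer.
    def chase(self, addr, hops):
        for _ in range(hops):
            addr = self._committed[addr]
        return addr
```

For example, after `write("a", "b")` and `write("b", "c")`, `chase("a", 2)` returns `"c"`, while a value staged with `tx_write` stays invisible to `read` until `commit` is called.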
challenges:
1. reliability with low-cost/high-density media
2. integration with existing software stacks, caches, tiering (flash is consumer-oriented, not data-center-oriented, so it is up to the software people to make it work for data centers)
3. system and data center implications - networking, scale out vs. scale up
Future trends in data protection/backup (Data Domain)
phase 0: tape
phase 1: deduplicated disk
difference between backup storage and primary storage (see their FAST '12 paper) -- IOPS don't matter for backup
so disks should be optimized for backup
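The idea behind deduplicated disk can be sketched as a content-addressed chunk store: each distinct chunk is stored once and a backup is just a list of fingerprints. This is a minimal sketch under stated assumptions, not Data Domain's design; the `DedupStore` name, the fixed-size chunking, and the tiny default chunk size are choices made only for illustration (production systems often use variable-size chunking).

```python
import hashlib


class DedupStore:
    """Toy deduplicating store; hypothetical, with fixed-size chunks."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes, each stored once

    def put(self, data):
        """Store data and return its recipe (ordered list of fingerprints)."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # setdefault keeps the existing copy if this chunk was seen before.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        return recipe

    def get(self, recipe):
        """Rebuild the original bytes from a recipe."""
        return b"".join(self.chunks[fp] for fp in recipe)
```

Storing two backups that share most of their content grows `chunks` only by the chunks that differ, which is why repeated full backups are cheap on such a store.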
phase 2: optimized deduplicated disk (disk no longer behaves like tape, so you can do things differently than you would with tape)
new I/O interfaces for backup
incremental forever, virtual full (instead of weekly full backups and daily incrementals)
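The "incremental forever, virtual full" scheme above can be sketched as follows: the client only ever sends changes, and the backup side synthesizes a full backup from one base full plus the incrementals. This is a minimal sketch, not any product's protocol; the snapshot representation (a dict of path to content) and the function names `take_incremental` and `virtual_full` are assumptions for illustration.

```python
def take_incremental(previous, current):
    """Return (changed, deleted) between two snapshots.

    Snapshots are dicts mapping path -> content (a hypothetical format).
    """
    changed = {
        path: content
        for path, content in current.items()
        if path not in previous or previous[path] != content
    }
    deleted = [path for path in previous if path not in current]
    return changed, deleted


def virtual_full(base_full, incrementals):
    """Synthesize a full backup from a base full and ordered incrementals."""
    full = dict(base_full)  # copy so the stored base is left untouched
    for changed, deleted in incrementals:
        full.update(changed)
        for path in deleted:
            full.pop(path, None)
    return full
```

Applying each day's incremental in order to the original full yields the same result as a fresh full backup of the latest state, without re-reading unchanged data from primary storage.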
phase 3: integrated data protection (backup) silos
now we end up with multiple backups, done by everybody and at each layer (and you don't know how much your organization is spending on backups!)
phase 4: solve problems of phase 3
provide a data protection data cloud (what is that?)
Why buy innovation (Microsoft Research; not a storage guy, works on hardware acceleration for search engines)
what innovations in data centers?
1. a different scale-cost curve (lower cost at the same scale) for the same value -- but innovation has a fixed start-up cost (even at zero scale)
2. a different (and greater than linear!) scale-value curve!
e.g., new capability, competitive advantage, new business
university research typically focuses on category 1 rather than category 2; industry typically wants category 2 a lot.
Why does academia try so hard to make little performance tweaks instead of thinking about category 2???
Questions:
1. why increase areal density rather than just adding platters or RPM?
there are limits on the number of platters and on RPM (energy, say)
2. what are the best mechanisms for the new interfaces?
people like to write programs differently (some like the memory model, while others do I/O well)
3. when will these new media hit the mainstream?
well, still in an early phase... we'll see where they go
4. for category 2 research, it's harder to measure new value than to measure performance. How do we convince people our research has value?
it's up to the community as a whole to reward type 2 research. 30 years ago we had papers with fewer measurements than today's, so it is a cultural problem of the community
5. why didn't you mention arrays? (disk arrays, flash arrays, memory arrays, etc.)
RAID is well known
people are building flash arrays, and some things there are new compared to traditional RAID
6. innovations in storage interfaces: for years we have had read/write blocks; what has happened to change that? Or are we going to end up with read/write blocks anyway? (by Remzi)
Data Domain: we got value by changing the interface
Fusion-io: the media have changed, and the role of the open source community has also changed. even though new interfaces are not being picked up by old apps, they could be used by new apps
Microsoft: if we can get a lot of performance, then it's worth changing the interface
Data Domain: not just performance; a lot of the time it is for new capabilities