by Brendan Saltaformaggio@Purdue
memory forensics: does not require a suspect's password to unlock the device, and is oblivious to any persistent storage encryption
Evidence in memory is stored as data structures
previous state of the art: evidence is recovered from plain-text or self-evident fields
however, it cannot interpret the content of the data structures
Approach: reuse the functions that print/render the data
intuition: invalid data content breaks the function, whereas valid data generates output
how to find rendering logic: dynamic analysis on binary
how to isolate entry point: test every "candidate" entry point
how to set up the proper context: run with some dummy input up to the entry point (see the sketch below)
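A minimal sketch of the entry-point search, assuming hypothetical helpers (run_in_sandbox, candidate_functions) rather than the actual tooling: each candidate function is executed on the recovered data structure inside an isolated environment, and only candidates that produce plausible output survive.

def looks_like_rendered_output(output: bytes) -> bool:
    # Heuristic: valid data tends to produce non-empty, printable output.
    return bool(output) and all(32 <= b < 127 or b in (9, 10, 13) for b in output)

def find_rendering_entry_points(candidate_functions, recovered_struct, run_in_sandbox):
    # run_in_sandbox(func, data) is assumed to run the program on dummy input up to
    # `func`, then invoke `func` on `data` in isolation, returning captured output
    # or raising if it crashes.
    survivors = []
    for func in candidate_functions:
        try:
            output = run_in_sandbox(func, recovered_struct)
        except Exception:
            continue  # invalid entry point or invalid data: the function breaks
        if looks_like_rendered_output(output):
            survivors.append(func)
    return survivors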
What about mobile environment:
Problem: too many apps to identify the rendering logic of each one individually
use the Android GUI framework's data structures, e.g., "draw_ops"
what about background applications, where some of the GUI tree nodes are nullified?
1. try to reconstruct the tree structure
2. to find the graphical content of each node: piece the screen together by modeling it as a matching problem (see the sketch below)
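A rough sketch of the matching idea (not the paper's algorithm): assign each recovered graphics buffer to the GUI tree node it most plausibly belongs to, using an assumed cost function (e.g., how well the buffer's dimensions fit the node's bounds) and a simple greedy minimum-cost matching.

def match_buffers_to_nodes(nodes, buffers, cost):
    # cost(node, buffer) is an assumed scoring function; lower means a better fit.
    pairs = sorted(((cost(nodes[i], buffers[j]), i, j)
                    for i in range(len(nodes)) for j in range(len(buffers))),
                   key=lambda t: t[0])
    used_nodes, used_buffers, assignment = set(), set(), {}
    for _, i, j in pairs:
        if i not in used_nodes and j not in used_buffers:
            assignment[i] = j          # node index i gets buffer index j
            used_nodes.add(i)
            used_buffers.add(j)
    return assignment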
How to reconstruct previous screens (not just the current one)?
Limitation of the previous approach: only recovers the latest screen
Approach: profile how the app's internal memory and screen-drawing memory sizes change over time (as the screen changes)
Solution: utilize Android's redraw mechanisms to reuse app's internal memory
generically interleave the execution of a live Android environment and the memory image
Q: how dependent are these techniques on a specific version of Android?
A: we updated from Android 2.2 to 6.0, the essence does not change
Vision: cyber forensics needs to shift from personal experience to more formal methods
Yang Suli's Blog
Monday, February 6, 2017
Friday, February 3, 2017
IOweYou Credit Network
by Aniket Kate @Purdue
centralized (Amazon, Uber, etc.) --> decentralized business model
crypto-currencies may or may not survive, but the concept of distributed ledger/blockchain remains
protocol: application level, middleware/service level, infrastructure/base level
thing-for-thing trade (barter): problems arise from the lack of a common medium
stone money: oral history, no physical movement
Questions:
How well do we understand their consensus process?
Proof of Work vs Proof of Stake
Bitcoin network has scalability problem because of all the communication required.
Credit networks address this problem.
Essence of a credit network: confidence in your friends
Problems of credit network:
Path selection (how do we find and select paths)
Liquidity of the network (transactions are restricted to certain nodes and paths; what is the probability of transaction success?)
Game prevention --> loss due to misbehaving identities is bounded and (sometimes) localized ---> assumes introducing nodes is much easier than drawing trust from well-behaved nodes
Examples:
1. Bazaar (NSDI'11) --> appears to be evaluated via simulation on eBay data
2. Ripple Credit Network (realized)
allows for currency exchange (node performs exchange, you need to find a path with such nodes)
Comparison with the Bitcoin network:
transfer: Bitcoin transfers directly between two wallets; a credit network transfers via a path with enough credit (see the sketch below)
liquidity: good vs. restricted by path availability
scalability: limited (<100 bps) vs. high scalability
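A minimal sketch of a path-based transfer in a credit network, under assumed data structures (credit[u][v] = credit currently available for u to pay v). This is not Ripple's actual protocol; it just illustrates why a payment needs a path with enough credit on every hop, and why liquidity is restricted by path availability.

def pay_along_path(credit, path, amount):
    # The payment succeeds only if every hop has enough available credit.
    for u, v in zip(path, path[1:]):
        if credit.get(u, {}).get(v, 0) < amount:
            return False  # insufficient liquidity on this path
    # Apply the transfer: credit is consumed in the payment direction and
    # created in the reverse direction on every hop.
    for u, v in zip(path, path[1:]):
        credit[u][v] -= amount
        credit.setdefault(v, {})[u] = credit.get(v, {}).get(u, 0) + amount
    return True

# Example: Alice pays Carol 5 through Bob.
credit = {"alice": {"bob": 10}, "bob": {"carol": 8}}
print(pay_along_path(credit, ["alice", "bob", "carol"], 5))  # True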
Can augment the credit network with social trust
Privacy might be a problem in Ripple: if I can link one transaction to you, I can find all your transactions.
How to define privacy?
transaction value privacy and transaction receiver privacy
Thursday, May 5, 2016
MSST16 Session 4: Spotlight on Flash memory and Solid-State Drives
Adaptive policies for balancing performance and lifetime of mixed SSD arrays through workload sampling
high-end SSD: cache
Low-end SSD: main storage
1 high-end SSD cache in front of 3 low-end SSDs: the high-end SSD's lifetime is 1.47 years versus 6.34 years for the low-end SSDs, assuming an LRU cache policy
problem: the high-end SSD cache can wear out faster than the low-end SSD main storage
approach: balance the performance and lifetime at the same time
metric: optimize latency over lifetime (lower is better)
selective caching policies ---> decide the caching policy based on request size and hotness (see the sketch below)
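A hypothetical illustration of a selective caching decision (the thresholds and the hotness counter are assumptions, not the paper's policy): admit into the high-end SSD only requests that are small and hot, since those give the most latency benefit per byte of wear added to the high-end device.

from collections import defaultdict

class SelectiveCache:
    def __init__(self, size_limit_kb=64, hot_threshold=4):
        self.size_limit_kb = size_limit_kb
        self.hot_threshold = hot_threshold
        self.access_count = defaultdict(int)   # crude hotness tracking

    def admit(self, block_id, request_size_kb):
        # Returns True if this request should be cached in the high-end SSD.
        self.access_count[block_id] += 1
        return (request_size_kb <= self.size_limit_kb
                and self.access_count[block_id] >= self.hot_threshold)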
REAL: A Retention Error Aware LDPC Decoding Scheme to Improve NAND Flash Read Performance
error correction codes: BCH, LDPC
Analytic models for flash-based SSD performance when subject to trimming
SSD structure: N blocks, b pages per block; the unit of data exchange is a page; a page has 3 possible states: erased, valid, or invalid
data can only be written to pages in the erased state
erasure can only be performed on a whole block
assume the victim block holds j valid pages with probability p_j; then the write amplification A equals
A = b / (b - sum_j (j * p_j))
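For concreteness, a small script that evaluates this formula for an assumed distribution of valid pages on the victim block (a toy binomial distribution standing in for uniform random writes; the paper's exact distributions differ):

from math import comb

def write_amplification(b, p):
    # A = b / (b - sum_j j * p_j), where p[j] = probability that the victim
    # block still holds j valid pages (0 <= j <= b).
    expected_valid = sum(j * pj for j, pj in enumerate(p))
    return b / (b - expected_valid)

b = 64          # pages per block
rho = 0.8       # assumed average fraction of valid pages
p = [comb(b, j) * rho**j * (1 - rho)**(b - j) for j in range(b + 1)]
print(write_amplification(b, p))   # 5.0 for this toy distribution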
prior work: mostly assumes uniform random writes and Rosenblum(hot/cold) workloads
exact (closed form) results when N -> infinity
1. greedy is optimal under random writes; d-choices is close to optimal (for d as small as 10)
2. increasing hotness worsens WA in the case of a single WF (write frontier), as no hot/cold data separation takes place
3. double WF (separates writes triggered by the host and by GC): WA decreases with hotness (as partial hot/cold data separation takes place)
However, they all assume no trimming
How do we model trim behavior?
Main takeaway:
a system with trimming behaves like one without trimming at a lower effective load (utilization)
Reducing Write Amplification of Flash Storage through Cooperative Data Management with NVM
write amplification and GC causes SSD performance fluctuation
in traditional systems, all live pages need to be copied to another block while erasing
however, CDM skips copying (see the sketch below)
"removable" state: a block can be erased without copying if the live data that would need to be copied is already present in the NVM cache
issue 1: consistency ---> file system needs to be modified
issue 2: communication overhead -> events in the cache and storage should be notified to each other synchronously --> use NVMe to piggyback
NV-cache as in-storage cache
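A rough sketch of the copy-skipping idea, under assumed structures (not the paper's implementation): if every live page of the victim block also has an up-to-date copy in the NVM cache, the block is "removable" and can be erased without relocating any data.

def is_removable(live_pages, nvm_cached_pages):
    # live_pages / nvm_cached_pages: sets of logical page ids.
    return live_pages <= nvm_cached_pages

def garbage_collect(live_pages, nvm_cached_pages, copy_page, erase_block):
    if is_removable(live_pages, nvm_cached_pages):
        erase_block()                 # no copy traffic: GC write amplification avoided
    else:
        for page in live_pages - nvm_cached_pages:
            copy_page(page)           # conventional GC path for uncached pages
        erase_block()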
evaluation:
CDM reduces write-amplification by 20x, improves response time as well
Exploiting Latency Variation for Access Conflict Reduction of NAND Flash Memory
motivation:
ECC complexity, ECC capability, and read speed trade-offs: a higher sensing level means more precise reads and higher ECC capability
program size and write speed trade-off:
process variation and retention variation lead to speed variation
hotness-aware write scheduling:
retention aware read scheduling
write: size-based predicted hotness
read:
evaluation:
MSST'16 session 3 Store More, Longer, and for Less: Deduplication and Archival Systems
A Long Term User-Centric Analysis of Deduplication Patterns
study a dataset of 21 months, 1 snapshot per user per day
tracer.filesystems.org
a lot of small files (< 1M), but a few large files consume most of the space
in general, small files achieve higher deduplication ratio than large files
per-user deduplication ratios and cross-user redundancy differ a lot
Lazy Exact Deduplication
postpone disk lookups (fingerprint lookups) until we can do them in a batch
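A minimal sketch of the batching idea, with an assumed lookup_batch interface standing in for the on-disk fingerprint index (this is not the paper's system, just the shape of the optimization):

import hashlib

class LazyDeduper:
    def __init__(self, lookup_batch, batch_size=1024):
        self.lookup_batch = lookup_batch   # assumed: takes a set of fingerprints,
                                           # returns the subset already stored
        self.batch_size = batch_size
        self.pending = []                  # (fingerprint, chunk) awaiting lookup

    def add_chunk(self, chunk):
        fp = hashlib.sha1(chunk).hexdigest()
        self.pending.append((fp, chunk))
        return self.flush() if len(self.pending) >= self.batch_size else []

    def flush(self):
        # One batched index lookup instead of one disk seek per chunk.
        known = self.lookup_batch({fp for fp, _ in self.pending})
        new_chunks = [(fp, c) for fp, c in self.pending if fp not in known]
        self.pending = []
        return new_chunks                  # only these chunks need to be written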
Sorted Deduplication: How to Process Thousands of Backup Streams
the requirements are changing: a few large streams ---> many streams (e.g., cloud backup)
Effects of Prolonged Media Usage and Long-term Planning on Archival Systems
preserving data for ~100 to ~1000 years
question:
when do you retire/replace media?
how long do you plan for?
Failure scenarios: device failures and economic failure
1. should media be used past their manufacturer-suggested service life or warranty period?
(for archival data, disks might last longer)
use a model of the purchase, maintenance, and retirement phases to calculate cost (see the sketch below)
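A toy cost model, just to make the purchase/maintenance/retirement reasoning concrete (all numbers and the linear structure are assumptions, not the paper's model):

def total_cost(years, device_life_years, purchase_cost, annual_maintenance, retirement_cost):
    # Cost of keeping one device slot populated for `years`, replacing the
    # device every `device_life_years`.
    replacements = -(-years // device_life_years)   # ceiling division
    return replacements * (purchase_cost + retirement_cost) + years * annual_maintenance

# Stretching media past the warranty trades fewer replacements against
# (presumably) higher maintenance/failure cost:
print(total_cost(100, 5, 200, 20, 10))   # replace at warranty: 6200
print(total_cost(100, 8, 200, 30, 10))   # keep media longer:   5730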
MSST'16: It's never too fast: storage performance enhancements
Pfimbi: Accelerating Big Data Jobs Through Flow-Controlled Data Replication
in HDFS, synchronous replication (in a pipeline) creates performance bottlenecks and seldom helps application performance
- only 2% of data was read within 5 minutes of being written
So do replication asynchronously
Need to use flow control to manage congestion as well
ManyLogs: Improved CMR/SMR Disk Bandwidth and Faster Durability with Scattered Logs
problem: small durable writes severely impact the bandwidth of other users (e.g., a sequential reader)
in this case, data journaling outperforms ordered journaling!
Ordered journaling: efficient for large writes
data journaling: efficient for small writes (fewer seeks)
previous work: adaptive journaling (ATC'05)
many logs; small writes go to the nearest log (to the current head position), see the sketch below
where to put the logs on the disk? reserve 10MB on every platter (?)
checkpointing: lazy, instead of every 5 seconds, for ManyLogs
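A hypothetical sketch of the nearest-log selection (positions and units are made up; the real system works with on-disk layout details not captured here): given the reserved log locations and the current head position, pick the log region with the smallest seek distance.

def nearest_log(log_positions, head_position):
    # log_positions: representative LBAs (or track numbers) of the reserved log regions.
    return min(log_positions, key=lambda pos: abs(pos - head_position))

# Example: logs scattered every 10,000,000 LBAs (toy numbers).
logs = [i * 10_000_000 for i in range(10)]
print(nearest_log(logs, head_position=34_567_890))   # -> 30000000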
Sunday, January 31, 2016
Cluster setup in HBase (reposted)
Cluster setup in hbase
Before Starting hbase cluster
To configure HBase, we need a running Hadoop cluster, which will serve as the storage for HBase (HBase stores its data in HDFS). Please refer to Installing and configuring hadoop cluster. Also make sure that the user name and the path where HBase is installed are the same on all machines. In my case the user is hduser.
These are the steps to set up and run an HBase cluster. We have built the HBase cluster using three Ubuntu machines. A distributed HBase depends on a running ZooKeeper cluster; we are using the default ZooKeeper cluster, which is managed by HBase.
There are basically three types of nodes.
1. HBase Master: the HBase Master is responsible for assigning regions to the HBase RegionServers and monitors the health of each RegionServer.
2. ZooKeeper: for any distributed application, ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
3. HBase RegionServer: the HBase RegionServer is responsible for handling client read and write requests. It communicates with the HBase Master to get a list of regions to serve and to tell the master that it is alive.
In our example, one machine in the cluster is designated as the HBase Master and ZooKeeper node. The rest of the machines in the cluster act as RegionServers.
INSTALLING AND CONFIGURING HBASE MASTER
1. Download hbase-1.1.2.tar.gz from http://www.apache.org/dyn/closer.cgi/hbase/ and extract it into some directory on your computer. We will refer to this path as $HBASE_INSTALL_DIR.
2. Edit the file /etc/hosts on the master machine and add the following lines.
192.168.35.16 bsw-HbaseMaster bsw-HbaseMaster
The HBase Master and the Hadoop NameNode (the master machine in the Hadoop cluster) are configured on the same machine.
192.168.35.17 bsw-data1
192.168.35.25 bsw-data2
Note: Run the command "ping bsw-HbaseMaster" to check whether bsw-HbaseMaster resolves to its actual IP, not the localhost IP.
Here bsw-data1 and bsw-data2 are the machines where the RegionServers run, and bsw-HbaseMaster is the machine where the HBase Master runs.
3. We need to configure passwordless SSH login from bsw-HbaseMaster to all RegionServer machines.
Execute the following commands on bsw-HbaseMaster machine.
$ssh-keygen -t rsa
$scp .ssh/id_rsa.pub hduser@bsw-data1:~/.ssh/authorized_keys
$scp .ssh/id_rsa.pub hduser@bsw-data2:~/.ssh/authorized_keys
4. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and set the $JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
5. Open the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties.
<configuration>
  <property>
    <name>hbase.master</name>
    <value>bsw-HbaseMaster:60000</value>
    <description>The host and port that the HBase master runs at.</description>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://bsw-HbaseMaster:9000/hadoop-datastore</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>Possible values are false: standalone and pseudo-distributed setups with managed Zookeeper; true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>bsw-HbaseMaster</value>
  </property>
</configuration>
Note:-
In our example, ZooKeeper and the HBase Master both run on the same machine.
6. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and uncomment the following line:
export HBASE_MANAGES_ZK=true
7. Open the file $HBASE_INSTALL_DIR/conf/regionservers and add all the regionserver machine names.
bsw-data1
bsw-data2
bsw-HbaseMaster
Note: Add bsw-HbaseMaster machine name only if you are running a regionserver on bsw-HbaseMaster machine.
INSTALLING AND CONFIGURING HBASE REGIONSERVER
1. Download hbase-1.1.2.tar.gz from http://www.apache.org/dyn/closer.cgi/hbase/ and extract it into some directory on your computer. We will refer to this path as $HBASE_INSTALL_DIR.
2. Edit the file /etc/hosts on the hbase-regionserver machine and add the following lines.
192.168.35.16 bsw-HbaseMaster bsw-HbaseMaster
Note: In my case, bsw-HbaseMaster and the hadoop-namenode are running on the same machine.
Note: Run the command "ping bsw-HbaseMaster" to check whether bsw-HbaseMaster resolves to its actual IP, not the localhost IP.
3. We need to configure passwordless SSH login from bsw-data1 and bsw-data2 to the bsw-HbaseMaster machine.
Execute the following commands on the bsw-data1 and bsw-data2 machines.
$ssh-keygen -t rsa
$scp .ssh/id_rsa.pub hduser@bsw-HbaseMaster:~/.ssh/authorized_keys2
4. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and set the $JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
Note: If you are using OpenJDK, then give the path of OpenJDK.
5. Open the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties.
<configuration>
  <property>
    <name>hbase.master</name>
    <value>bsw-HbaseMaster:60000</value>
    <description>The host and port that the HBase master runs at.</description>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://bsw-HbaseMaster:9000/hadoop-datastore</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>Possible values are false: standalone and pseudo-distributed setups with managed Zookeeper; true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>bsw-HbaseMaster</value>
  </property>
</configuration>
6. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and uncomment the following line:
export HBASE_MANAGES_ZK=true
Note:-
The above steps are required on all the DataNodes in the Hadoop cluster.
START AND STOP HBASE CLUSTER
1. Starting the Hbase Cluster:-
We only need to start the daemons on the bsw-HbaseMaster machine; it will start the daemons on all RegionServer machines. Execute the following command to start the HBase cluster.
$HBASE_INSTALL_DIR/bin/start-hbase.sh
Note:-
At this point, the following Java processes should run on hbase-master machine.
hduser@bsw-HbaseMaster:$jps
14143 Jps
14007 HQuorumPeer
14066 HMaster
and the following Java processes should run on the hbase-regionserver machines.
23026 HRegionServer
23171 Jps
2. Starting the hbase shell:-
$HBASE_INSTALL_DIR/bin/hbase shell
HBase Shell; enter 'help' for list of supported commands.
Version: 0.20.6, r965666, Mon Jul 19 16:54:48 PDT 2010
hbase(main):001:0>
Now, create a table in HBase.
hbase(main):001:0> create 't1','f1'
0 row(s) in 1.2910 seconds
hbase(main):002:0>
Note: – If table is created successfully, then everything is running fine.
3. Stopping the Hbase Cluster:-
Execute the following command on the hbase-master machine to stop the HBase cluster.
$HBASE_INSTALL_DIR/bin/stop-hbase.sh
Tuesday, January 26, 2016
build hadoop-2.7.1 from source code on Ubuntu-15.10
1. Download hadoop-2.7.1-src.tar.gz and untar it
2. Follow BUILDING.txt to install dependencies
3. But instead of installing oracle-java7-installer (which Oracle has already restricted), install oracle-java8-installer
4. Do not install libprotobuf-dev and protobuf-compiler from apt-get, as that will pull in version 2.6.1, but this version of Hadoop requires 2.5.0. Instead, download protobuf-2.5.0 from the web and run protobuf_arm64_patch.sh (attached below) to patch it, then do './configure; make; make install; ldconfig'
5. Do 'cd hadoop-maven-plugins; mvn install' before building Hadoop. This is required for building any Hadoop modules (not just Eclipse support); otherwise you will run into an mvn plugin error
6. Run 'mvn clean install -DskipTests -Pdist -Pnative' to build Hadoop; you should find the hadoop-2.7.1 directory under hadoop-dist/target
7. Then follow http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html for single-node setup. Remember to modify hadoop-env.sh to set the JAVA_HOME variable
Contents of protobuf_arm64_patch.sh:
cd protobuf-2.5.0/
wget https://gist.github.com/BennettSmith/7111094/raw/171695f70b102de2301f5b45d9e9ab3167b4a0e8/0001-Add-generic-GCC-support-for-atomic-operations.patch -O /tmp/0001-Add-generic-GCC-support-for-atomic-operations.patch
wget https://gist.github.com/BennettSmith/7111094/raw/a4e85ffc82af00ae7984020300db51a62110db48/0001-Add-generic-gcc-header-to-Makefile.am.patch -O /tmp/0001-Add-generic-gcc-header-to-Makefile.am.patch
patch -p1 < /tmp/0001-Add-generic-GCC-support-for-atomic-operations.patch
patch -p1 < /tmp/0001-Add-generic-gcc-header-to-Makefile.am.patch
rm /tmp/0001-Add-generic-GCC-support-for-atomic-operations.patch
rm /tmp/0001-Add-generic-gcc-header-to-Makefile.am.patch