Do big data stream processing in the stream way

Reading: Years in Big Data. Months with Apache Flink. 5 Early Observations With Stream Processing: https://data-artisans.com/blog/early-observations-apache-flink. The article suggest adopting the right solution, Flink, for big data processing. Flink is interesting and built for stream processing. The broader view and take away may be to solve problems using the right solution. We saw many painful […]

What are the differences between NUMA architecture and SMP architecture?

NUMA Architecture: Non-Uniform Memory Access architecture. SMP: Symmetric Multiprocessing architecture. In a Symmetric Multiprocessor, the architectural “distance” to any memory location is the same for all processors, i.e. “symmetric”. In a NonUniform Memory Access machine, each processor is “closer” to some memory locations than others; i.e. memory is partitioned among them Asymmetrically. From my understanding, […]

What is the difference between work conserving I/O scheduler and non-work conserving I/O scheduler?

What is the difference between work conserving I/O scheduler and non-work conserving I/O scheduler? In a work-conserving mode, the scheduler must choose one of the pending requests, if any, to dispatch, even if the pending requests are far away from the current disk head position. The rationale for non-work-conserving schedulers, such as the anticipatory scheduler […]

How to handle missing blocks and blocks with corrupt replicas in HDFS?

One of HDFS cluster’s hdfs dfsadmin -report reports: Under replicated blocks: 139016 Blocks with corrupt replicas: 9 Missing blocks: 0 The “Under replicated blocks” can be re-replicated automatically after some time. How to handle the missing blocks and blocks with corrupt replicas in HDFS? Understanding these blocks A block is called corrupt by HDFS if […]

HDFS stays in safe mode because of reported blocks not reaching 0.9990 of total blocks

After a node failure and restarting the HDFS, the NameNode reports: “The reported blocks 1968810 needs additional 5071 blocks to reach the threshold 0.9990 of total blocks 1975856. Safe mode will be turned off automatically.” in the log. Why this happens? And how to fix it? About why the NameNode stays in the safe mode: […]

What are the differences between database DDL and DML

Differences beween DDL (Data Definition Language) and DML (Data Manipulation Language) Data Definition Language (DDL) statements are used to define the database structure or schema. Data Manipulation Language (DML) statements are used for managing data within schema objects. References:http://www.orafaq.com/faq/what_are_the_difference_between_ddl_dml_and_dcl_commands Answered by harryxiyou.

What are the DDL and DML of Shark (Spark SQL)?

Currently, I wanna take Shark’s (Spark SQL) DDL and DML as an reference to design/implement SQLE’s DDL and DML. However, I cannot find its DDL and DML. I can only find several SQLs in Shark paper[1]. [1] shark paper – http://tab.d-thinker.org/showthread.php?tid=2585 Shark’s language is Hive QL. HQL’s DDL and DML can be found at Hive […]

Systems Conferences

Which ones are good systems conferences? Top ones by ACM and USENIX: OSDI: https://www.usenix.org/conferences/byname/179 SOSP: http://sosp.org/ Other SIGOPS Events: http://www.sigops.org/conf-sponsored.html EuroSys: http://www.eurosys.org/ SoCC: http://www.socc2013.org/ (SoCC 2013) ASPLOS: http://www.sigplan.org/Conferences/ASPLOS/Main VEE: http://www.sigplan.org/vee.htm USENIX ATC: https://www.usenix.org/conferences/byname/131 NSDI: https://www.usenix.org/conferences/byname/178 Answered by anonymous. IEEE Conferences: ICDCS: http://www.temple.edu/cis/icdcs2013/ (2013) IPDPS: http://www.ipdps.org/ Other related ones and workshops: HPCA: Search HPCA ConferenceSC: http://www.supercomp.org/IEEE […]