川島 英之 (カワシマ ヒデユキ)

Kawashima, Hideyuki

Affiliation (Campus)

Faculty of Environment and Information Studies (Shonan Fujisawa)

Position

Associate Professor

HP

 

Papers

  • Scalable distributed metadata server based on nonblocking transactions

    Hiraga K., Tatebe O., Kawashima H.

    Journal of Universal Computer Science 26(1), 89-106, 2020

    ISSN 0948-695X

    Metadata performance scalability is critically important in high-performance computing when many small files are accessed by millions of clients. This paper proposes a design for a scalable distributed metadata server, PPMDS, for parallel file systems using multiple key-value servers. In PPMDS, the hierarchical namespace of a file system is efficiently managed by multiple servers. Multiple entries can be atomically updated using a nonblocking distributed transaction based on a dynamic software transactional memory algorithm. This paper also proposes optimizations that further improve metadata performance by introducing server-side transaction processing, multiple readers, and a shared lock mode, which reduce the number of remote procedure calls and prevent unnecessary blocking. The performance evaluation shows scalable performance up to 3 servers, reaching 62,000 operations per second, a 2.58x improvement over a single metadata server.

  • An analysis of concurrency control protocols for in-memory databases with ccbench

    Tanabe T., Hoshino T., Kawashima H., Tatebe O.

    Proceedings of the VLDB Endowment 13(13), 3531-3544, 2020

    This paper presents yet another concurrency control analysis platform, CCBench. CCBench supports seven protocols (Silo, TicToc, MOCC, Cicada, SI, SI with latch-free SSN, 2PL) and seven versatile optimization methods, and enables the configuration of seven workload parameters. We analyzed the protocols and optimization methods using various workload parameters and a thread count of 224. Previous studies focused on thread scalability and did not explore the space analyzed here. We classified the optimization methods on the basis of three performance factors: CPU cache, delay on conflict, and version lifetime. Analyses using CCBench and 224 threads produced six insights. (I1) The performance of optimistic concurrency control protocols for a read-only workload rapidly degrades as cardinality increases, even without L3 cache misses. (I2) Silo can outperform TicToc for some write-intensive workloads by using the invisible reads optimization. (I3) The effectiveness of the two approaches to coping with conflict (wait and no-wait) depends on the situation. (I4) OCC reads the same record two or more times if a concurrent transaction interruption occurs, which can improve performance. (I5) Mixing different implementations is inappropriate for deep analysis. (I6) Even a state-of-the-art garbage collection method cannot improve the performance of multi-version protocols if a single long transaction is mixed into the workload. On the basis of I4, we defined the read phase extension optimization, in which an artificial delay is added to the read phase. On the basis of I6, we defined the aggressive garbage collection optimization, in which even visible versions are collected. The code for CCBench and all the data in this paper are available online at GitHub.

  • Accelerating Sequence Operator with Reduced Expression

    Kawashima H., Tatebe O.

    Frontiers in Artificial Intelligence and Applications 321, 71-82, December 2019

    ISBN 9781643680446

    Sequence operators efficiently combine multiple events when state recognition is performed over time series events. Since sensor data are inherently noisy, one can take a strict attitude toward them: any time series event may be a false positive, so every possible complex event should be constructed. This attitude is called the skip-till-any-match model in the sequence operator. Under this model, a huge number of potential complex events is generated. A sequence operator usually supports both Kleene closure and non-Kleene closure. While efficient methods have been studied for Kleene closure, those for non-Kleene closure are still being explored. In this paper, we propose the reduced expression method to improve the efficiency of sequence operator processing for the skip-till-any-match model. Experimental results showed that the proposed method improved on SASE, the conventional method, in both processing time and memory size, by up to a factor of several thousand.

  • Skew-Aware Collective Communication for MapReduce Shuffling

    Daikoku H., Kawashima H., Tatebe O.

    Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018, 3331-3340, January 2019

    ISBN 9781538650356

    This paper proposes and examines three in-memory shuffling methods designed to address problems in MapReduce shuffling caused by skewed data. Coupled Shuffle Architecture (CSA) employs a single pairwise all-to-all exchange to shuffle both blocks, which are the units of shuffle transfer, and meta-blocks, which contain the metadata of the corresponding blocks. Decoupled Shuffle Architecture (DSA) separates the shuffling of meta-blocks and blocks, and applies a different all-to-all exchange algorithm to each shuffling process, attempting to mitigate the impact of stragglers in strongly skewed distributions. Decoupled Shuffle Architecture with Skew-Aware Meta-Shuffle (DSA w/ SMS) autonomously determines the proper placement of blocks based on the memory consumption of each worker process. This approach targets extremely skewed situations where some worker processes could exceed their node memory limit. This study evaluates implementations of the three shuffling methods in our prototype in-memory MapReduce engine, which employs high-performance interconnects such as InfiniBand and Intel Omni-Path. Our results suggest that DSA w/ SMS is the only viable solution for extremely skewed data distributions, but this solution is only valid on systems equipped with high-performance interconnects. We also present a detailed investigation of the performance of CSA and DSA in various skew situations.

  • Skew-aware collective communication for MapReduce shuffling

    Daikoku H., Kawashima H., Tatebe O.

    IEICE Transactions on Information and Systems E102D(12), 2389-2399, 2019

    ISSN 0916-8532

    This paper proposes and examines three in-memory shuffling methods designed to address problems in MapReduce shuffling caused by skewed data. Coupled Shuffle Architecture (CSA) employs a single pairwise all-to-all exchange to shuffle both blocks, which are the units of shuffle transfer, and meta-blocks, which contain the metadata of the corresponding blocks. Decoupled Shuffle Architecture (DSA) separates the shuffling of meta-blocks and blocks, and applies a different all-to-all exchange algorithm to each shuffling process, attempting to mitigate the impact of stragglers in strongly skewed distributions. Decoupled Shuffle Architecture with Skew-Aware Meta-Shuffle (DSA w/ SMS) autonomously determines the proper placement of blocks based on the memory consumption of each worker process. This approach targets extremely skewed situations where some worker processes could exceed their node memory limit. This study evaluates implementations of the three shuffling methods in our prototype in-memory MapReduce engine, which employs high-performance interconnects such as InfiniBand and Intel Omni-Path. Our results suggest that DSA w/ SMS is the only viable solution for extremely skewed data distributions. We also present a detailed investigation of the performance of CSA and DSA in various skew situations.

Papers, etc., Registered in KOARA (Repository)

Research Projects of Competitive Funds, etc.

  • Creation of a Real-Time Data Kernel for Data-Intensive Science

    April 2019 - March 2022

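    Ministry of Education, Culture, Sports, Science and Technology / Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Hideyuki Kawashima, Grant-in-Aid for Scientific Research (B), Grant, Principal Investigator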

 

Courses Taught

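  • Seminar B

    Academic Year 2021

  • Mathematics of Optimization

    Academic Year 2021

  • Master's Seminar

    Academic Year 2021

  • Introduction to Databases

    Academic Year 2021

  • Special Research

    Academic Year 2021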
