Series ISSN: 2367-2005
Number 1 (August 18, 2023)
Original publisher: OpenProceedings.org, ISBN: 978-3-89318-091-2, Electronic Edition
Research Track
Balancing Utility and Fairness in Submodular Maximization
Yanhao Wang, Yuchen Li, Francesco Bonchi, Ying Wang
pp. 1–14
Stateful Entities: Object-oriented Cloud Applications as Distributed Dataflows
Kyriakos Psarakis, Wouter Zorgdrager, Marios Fragkoulis, Guido Salvaneschi, Asterios Katsifodimos
pp. 15–21
WDC Products: A Multi-Dimensional Entity Matching Benchmark
Ralph Peeters, Reng Chiz Der, Christian Bizer
pp. 22–33
In-Network Approximate and Efficient Spatiotemporal Range Queries on Moving Objects
Guang Yang, Abhirup Ghosh, Liang Liang, Thomas Heinis
pp. 34–46
Data Coverage for Detecting Representation Bias in Image Datasets: A Crowdsourcing Approach
Melika Mousavi, Nima Shahbazi, Abolfazl Asudeh
pp. 47–60
PyFroid: Scaling Data Analysis on a Commodity Workstation
Venkatesh Emani, Avrilia Floratou, Carlo Curino
pp. 61–67
A new PET for Data Collection via Forms with Data Minimization, Full Accuracy and Informed Consent
Nicolas Anciaux, Sabine Frittella, Baptiste Joffroy, Benjamin Nguyen, Guillaume Scerri
pp. 81–93
Computing Generic Abstractions from Application Datasets
Nelly Barret, Ioana Manolescu, Prajna Upadhyay
pp. 94–107
Data-CASE: Grounding Data Regulations for Compliant Data Processing Systems
Vishal Chakraborty, Stacy Ann-Elvy, Sharad Mehrotra, Faisal Nawab, Mohammad Sadoghi, Shantanu Sharma, Nalini Venkatasubramanian, Farhan Saeed
pp. 108–115
DSPC: Efficiently Answering Shortest Path Counting on Dynamic Graphs
Qingshuai Feng, You Peng, Wenjie Zhang, Xuemin Lin, Ying Zhang
pp. 116–128
Experiments & Analyses Track
Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data
Danrui Qi, Jinglin Peng, Yongjun He, Jiannan Wang
pp. 129–142
Number 2 (November 22, 2023)
Original publisher: OpenProceedings.org, ISBN: 978-3-89318-094-3, Electronic Edition
Research Track
Adaptive Compression for Databases
Leon Windheuser, Christoph Anneser, Huanchen Zhang, Thomas Neumann, Alfons Kemper
pp. 143–149
Fair Spatial Indexing: A paradigm for Group Spatial Fairness
Sina Shaham, Gabriel Ghinita, Cyrus Shahabi
pp. 150–161
Interactive Graph Repairs for Neighborhood Constraints
Paul Juillard, Angela Bonifati, Andrea Mauri
pp. 175–187
Benchmarking Stream Join Algorithms on GPUs: A Framework and its Application to the State-of-the-art
Dwi P. A. Nugroho, Philipp M. Grulich, Steffen Zeuch, Clemens Lutz, Stefano Bortoli, Volker Markl
pp. 188–200
CSR+: A Scalable Efficient CoSimRank Search Algorithm with Multi-Source Queries on Massive Graphs
Maoyin Zhang, Weiren Yu
pp. 201–207
FALCC: Efficiently performing locally fair and accurate classifications
Nico Lässig, Melanie Herschel
pp. 208–220
Relational Data Imputation with Graph Neural Networks
Riccardo Cappuzzo, Saravanan Thirumuruganathan, Paolo Papotti
pp. 221–233
Seraph: Continuous Queries on Property Graph Streams
Christopher Rost, Riccardo Tommasini, Angela Bonifati, Emanuele Della Valle, Erhard Rahm, Keith W. Hare, Stefan Plantikow, Petra Selmer, Hannes Voigt
pp. 234–247
GSim+: Efficient Retrieval of Node-to-Node Similarity Across Two Graphs at Billion Scale
Ruby Zhang, Weiren Yu
pp. 248–254
MCF-KV: Multi-Cuckoo-Filter Index based Key-Value Store with Persistent Memory
Hongjia Zou, Lidan Shou, Ke Chen, Xuan Zhou
pp. 255–267
Efficient Semantic Similarity Search over Spatio-textual Data
George S. Theodoropoulos, Kjetil Nørvåg, Christos Doulkeridis
pp. 268–280
EMBA: Entity Matching using Multi-Task Learning of BERT with Attention-over-Attention
Jing Zhang, Huan Sun, Joyce C. Ho
pp. 281–293
(Privately) Estimating Linkage Quality for Record Linkage
Martin Franke, Victor Christen, Peter Christen, Florens Rohde, Erhard Rahm
pp. 294–306
QPSeeker: An Efficient Neural Planner combining both data and queries through Variational Inference
Christos Tsapelas, Georgia Koutrika
pp. 307–319
Size-bounded Community Search over Large Bipartite Graphs
Yuting Zhang, Kai Wang, Wenjie Zhang, Wei Ni, Xuemin Lin
pp. 320–331
Optimizing Goodput through Sharing for Batch Analytics with Deadlines
Srinivas Karthik, Panagiotis Sioulas, Ahana Pradhan, Raghunandan Subramanya, Ioannis Mytilinis, Anastasia Ailamaki
pp. 332–344
Experiments & Analyses Track
Analysis of Open Government Datasets From a Data Design and Integration Perspective
Arif Usta, Chang Liu, Semih Salihoğlu
pp. 345–358
Vision Papers
The Missing Link? On the In-Between Instance Detection Task
Daniyal Kazempour, Claudius Zelenka, Peer Kröger
pp. 359–364
Number 3 (March 18, 2024)
Original publisher: OpenProceedings.org, ISBN: 978-3-89318-095-0, Electronic Edition
Research Track
Fine-Grained Geo-Obfuscation to Protect Workers’ Location Privacy in Time-Sensitive Spatial Crowdsourcing
Chenxi Qiu, Sourabh Yadav, Yuede Ji, Anna Squicciarini, Ram Dantu, Juanjuan Zhao, Cheng-Zhong Xu
pp. 373–385
SAGED: Few-Shot Meta Learning for Tabular Data Error Detection
Mohamed Abdelaal, Tim Ktitarev, Daniel Städtler, Harald Schöning
pp. 386–398
Efficient Discovery of Temporal Inclusion Dependencies in Wikipedia Tables
Leon Bornemann, Tobias Bleifuß, Dmitri V. Kalashnikov, Fatemeh Nargesian, Felix Naumann, Divesh Srivastava
pp. 399–411
Deco: Fast and Accurate Decentralized Aggregation of Count-Based Windows in Large-Scale IoT Applications
Wang Yue, Rafael Moczalla, Manisha Luthra, Tilmann Rabl
pp. 412–425
TASHEEH: Repairing Row-Structure in Raw CSV Files
Mazhar Hameed, Gerardo Vitagliano, Fabian Panse, Felix Naumann
pp. 426–439
HINT on Steroids: Batch Query Processing for Interval Data
Panagiotis Bouros, Artur Titkov, George Christodoulou, Christian Rauch, Nikos Mamoulis
pp. 440–446
Bridging the Gap: Complex Event Processing on Stream Processing Systems
Ariane Ziehn, Philipp M. Grulich, Steffen Zeuch, Volker Markl
pp. 447–460
Similarity Measures For Incomplete Database Instances
Boris Glavic, Giansalvatore Mecca, Renée J. Miller, Paolo Papotti, Donatello Santoro, Enzo Veltri
pp. 461–473
TaC: An Anti-Caching Key-Value Store on Heterogeneous Memory Architectures
Yunhong Ji, Wentao Huang, Xuan Zhou, Bingsheng He, Kian-Lee Tan
pp. 474–487
Spatial-temporal Forecasting for Regions without Observations
Xinyu Su, Jianzhong Qi, Egemen Tanin, Yanchuan Chang, Majid Sarvi
pp. 488–500
Aion: Efficient Temporal Graph Data Management
Georgios Theodorakis, James Clarkson, Jim Webber
pp. 501–514
Loss Compensation in Multi-Session Recommendation Under Limited Availability
Davide Azzalini, Fabio Azzalini, Chiara Criscuolo, Tommaso Dolci, Davide Martinenghi, Sihem Amer-Yahia
pp. 522–533
ORTOA: A Family of One Round Trip Protocols For Operation-Type Obliviousness
Sujaya Maiyya, Yuval Steinhart, Adrian Davila, Jason Du, Divyakant Agrawal, Prabhanjan Ananth, Amr El Abbadi
pp. 534–546
Efficient Proximity Search in Time-accumulating High-dimensional Data using Multi-level Block Indexing
Changhun Han, Suji Kim, Ha-Myung Park
pp. 547–558
WaZI: A Learned and Workload-aware Z-Index
Sachith Pai, Michael Mathioudakis, Yanhao Wang
pp. 559–571
Community Similarity based on User Profile Joins
Konstantinos Theocharidis, Hady W. Lauw
pp. 572–583
Shapley Values for Explanation in Two-sided Matching Applications
Suraj Shetiya, Ian P. Swift, Abolfazl Asudeh, Gautam Das
pp. 584–596
Optimizing Counterfactual-based Analysis of Machine Learning Models Through Databases
Aviv Ben Arie, Daniel Deutch, Nave Frost, Yair Horesh, Idan Meyuhas
pp. 597–609
Similarity Search based on Geo-footprints
Achilleas Michalopoulos, Konstantinos Lampropoulos, George Kelantonakis, Chrysostomos Zeginis, Kostas Magoutis, Nikos Mamoulis
pp. 610–616
Frequent Component Analysis for Large Time Series Databases with Gaussian Processes
Jan David Hüwel, Christian Beecks
pp. 617–622
Experiments & Analyses Track
A Framework to Evaluate Early Time-Series Classification Algorithms
Charilaos Akasiadis, Evgenios Kladis, Petro-Foti Kamberi, Evangelos Michelioudakis, Elias Alevizos, Alexander Artikis
pp. 623–635
Deep Clustering for Data Cleaning and Integration
Hafiz Tayyab Rauf, André Freitas, Norman W. Paton
pp. 636–649
Evaluating the Impact of Error-Bounded Lossy Compression on Time Series Forecasting
Carlos Enrique Muniz-Cuza, Søren Kejser Jensen, Jonas Brusokas, Nguyen Ho, Torben Bach Pedersen
pp. 650–663
Evaluation of Sampling Methods for Discovering Facts from Knowledge Graph Embeddings
Rama Widyadhana Bhagaskoro, Volker Markl, Zoi Kaoudi
pp. 664–675
Crayfish: Navigating the Labyrinth of Machine Learning Inference in Stream Processing Systems
Sonia Horchidan, Po Hao Chen, Emmanouil Kritharakis, Paris Carbone, Vasiliki Kalavri
pp. 676–689
Performance Analysis of Distributed GPU-Accelerated Task-Based Workflows
Marcos N. L. Carvalho, Anna Queralt, Oscar Romero, Alkis Simitsis, Cristian Tatu, Rosa M. Badia
pp. 690–703
Predicting Fact Contributions from Query Logs with Machine Learning
Dana Arad, Daniel Deutch, Nave Frost
pp. 704–716
Vision Papers
Serving Deep Learning Models from Relational Databases
Lixi Zhou, Qi Lin, Kanchan Chowdhury, Saif Masood, Alexandre Eichenberger, Hong Min, Alexander Sim, Jie Wang, Yida Wang, Kesheng Wu, Binhang Yuan, Jia Zou
pp. 717–724
Industrial & Applications Track
Pythagoras: Semantic Type Detection of Numerical Data in Enterprise Data Lakes
Sven Langenecker, Christoph Sturm, Christian Schalles, Carsten Binnig
pp. 725–733
Private and Efficient Federated Numerical Aggregation
Graham Cormode, Igor L. Markov, Harish Srinivas
pp. 734–742
MyRaft: High Availability in MySQL using Raft
Anirban Rahut, Vinaykumar Bhat, Abhinav Sharma, Yichen Shen, Bartlomiej Pelc, Chi Li, Ahsanul Haque, Yash Botadra, Xi Wang, Michael Percy, Ritwik Yadav, Yoshinori Matsunobu, Alan Liang, Igor Pozgaj, Tobias Asplund, Anatoly Karp, Luqun Lou, Pushap Goyal
pp. 743–752
Exploring unsupervised anomaly detection for vehicle predictive maintenance with partial information
Apostolos Giannoulidis, Anastasios Gounaris, Ioannis Constantinou
pp. 753–761
A Scalable System for Maritime Route and Event Forecasting
Georgios Grigoropoulos, Giannis Spiliopoulos, Ilias Chamatidis, Manolis Kaliorakis, Alexandros Troupiotis-Kapeliaris, Marios Vodas, Evangelia Filippou, Eva Chondrodima, Nikos Pelekis, Yannis Theodoridis, Dimitris Zissis, Konstantina Bereta
pp. 762–769
Patterns of Life: Global Inventory for maritime mobility patterns
Giannis Spiliopoulos, Marios Vodas, Georgios Grigoropoulos, Konstantina Bereta, Dimitris Zissis
pp. 770–777
Demonstration Track
HEADWORK: a Data-centric Crowdsourcing Platform for Complex Tasks and Participants
David Gross-Amblard, Marion Tommasi, Iandry Rakotoniaina, Constance Thierry, Rituraj Singh, Leo Jacoboni
pp. 778–781
MAGNETO: Edge AI for Human Activity Recognition - Privacy and Personalization
Jingwei Zuo, George Arvanitakis, Mthandazo Ndhlovu, Hakim Hacid
pp. 782–785
Demonstration of Link Traversal SPARQL Query Processing over the Decentralized Solid Environment
Ruben Taelman, Ruben Verborgh
pp. 786–789
Breaking Down Accuracy with Subspace Optimization
Donatella Firmani, Giorgio Grani, Flavia Tagliafierro
pp. 790–793
GraphSUM: Scalable Graph Summarization for Efficient Question Answering
Nasrin Shabani, Amin Beheshti, Jia Wu, Maryam Khanian Najafabadi, Jin Foo, Alireza Jolfaei
pp. 794–797
GMSA: A Digital Twin Application for Maritime Route and Event Forecasting
Georgios Grigoropoulos, Giannis Spiliopoulos, Ilias Chamatidis, Manolis Kaliorakis, Alexandros Troupiotis-Kapeliaris, Marios Vodas, Evangelia Filippou, Eva Chondrodima, Nikos Pelekis, Yannis Theodoridis, Dimitris Zissis, Konstantina Bereta
pp. 798–801
MIP: Advanced Data Processing and Analytics for Science and Medicine
Kostas Filippopolitis, Yannis Foufoulas, Minos Garofalakis, Apostolos Glenis, Yannis Ioannidis, Thanasis-Michail Karampatsis, Maria-Olympia Katsouli, Evdokia Mailli, Asimakis Papageorgiou-Mariglis, Giorgos Papanikos, George Pikramenos, Jason Sakellariou, Alkis Simitsis, Pauline Ducouret, Philippe Ryvlin, Manuel-Guy Spuhler
pp. 802–805
DriftLens: A Concept Drift Detection Tool
Salvatore Greco, Bartolomeo Vacchetti, Daniele Apiletti, Tania Cerquitelli
pp. 806–809
MEOS: An Open Source Library for Mobility Data Management
Esteban Zimányi, Mariana Duarte, Víctor Diví
pp. 810–813
BOLD: Knowledge Graph Exploration and Analysis Platform
Egor Dmitriev, Melisachew Wudage Chekol, Mirko Tobias Schaefer
pp. 814–817
MM-evoque: Query Synchronisation in Multi-Model Databases
Pavel Koupil, Jáchym Bártík, Irena Holubová
pp. 818–821
FACT-DM: A Framework for Automated Cost-Based Data Model Transformation
Jihane Mali, Shohreh Ahvar, Faten Atigui, Ahmed Azough, Nicolas Travers
pp. 822–825
How to Make your Duck Fly: Advanced Floating Point Compression to the Rescue
Panagiotis Liakos, Katia Papakonstantinopoulou, Thijs Bruineman, Mark Raasveldt, Yannis Kotidis
pp. 826–829
GizaML: A Collaborative Meta-learning Based Framework Using LLM For Automated Time-Series Forecasting
Esraa Sayed, Mohamed Maher, Omar Sedeek, Ahmed Eldamaty, Amr Kamel, Radwa El Shawi
pp. 830–833
"Please, Vadalog, tell me why": Interactive Explanation of Datalog-based Reasoning
Teodoro Baldazzi, Luigi Bellomarini, Stefano Ceri, Andrea Colombo, Andrea Gentili, Emanuel Sallinger
pp. 834–837
NeoSGG: A Scene Graph Generation Framework for Video-Surveillance Tasks
Pierre Lefebvre, Steven Le Moal, Ahmed Azough, Nicolas Travers
pp. 838–841
Logica: Declarative Data Science for Mere Mortals
Evgeny Skvortsov, Yilin Xia, Bertram Ludäscher
pp. 842–845
Tutorial Track
Finding Relevant Information in Big Datasets with ML
Uchechukwu F. Njoku, Alberto Abelló, Besim Bilalli, Gianluca Bontempi
pp. 846–849
A Differentially Private Guide for Graph Analytics
Felipe T. Brito, André L. C. Mendonça, Javam C. Machado
pp. 850–853
Dataset Discovery and Exploration: State-of-the-art, Challenges and Opportunities
Norman W. Paton, Zhenyu Wu
pp. 854–857