Gobblin
We reviewed Gobblin some time ago: T111409. Notes here: https://etherpad.wikimedia.org/p/gobblin-sprint
The project is now incubating at Apache.
It doesn't seem abandoned, but it isn't very active either (compared to other projects like Airflow). Camus is showing signs of age, and we are still unsure whether the HDFS connector will be usable in the near future, so I think we need to start evaluating Camus replacements.
Marmaray
Released by Uber. Like Gobblin, but built on Spark instead of MapReduce. Uses Uber's Hudi for HDFS upserts. Can also export data from HDFS. Not very active, and the documentation is sparse.
Kafka Connect + Kafka Connect HDFS
Kafka Connect is a generic Kafka source and sink framework. Kafka Connect HDFS is an HDFS + Hive sink. Its license was changed from Apache to the non-FLOSS Confluent Community License 1.5 years ago. It supports Hive schema evolution.
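To give an idea of what running this sink would involve, here is a minimal sketch of the JSON body one would submit to a Kafka Connect worker's REST API (POST /connectors) to create an HDFS sink with Hive integration. The connector name, topic names, and host names are hypothetical placeholders, and the property names are taken from our reading of the connector's documentation, so they should be checked against the version actually deployed. The sketch only builds and prints the payload; it does not contact any service.

```python
import json


def build_hdfs_sink_payload(name, topics, hdfs_url, metastore_uri, flush_size=1000):
    """Build the request body for creating an HDFS sink connector.

    All argument values are supplied by the caller; nothing here is
    specific to our cluster.
    """
    config = {
        # Connector class shipped with Kafka Connect HDFS.
        "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
        "tasks.max": "1",
        # Kafka Connect expects a comma-separated topic list.
        "topics": ",".join(topics),
        "hdfs.url": hdfs_url,
        # Number of records written before a file is committed to HDFS.
        "flush.size": str(flush_size),
        # Hive integration; schema evolution relies on a compatibility mode.
        "hive.integration": "true",
        "hive.metastore.uris": metastore_uri,
        "schema.compatibility": "BACKWARD",
    }
    return {"name": name, "config": config}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    payload = build_hdfs_sink_payload(
        name="example-hdfs-sink",
        topics=["example_topic_a", "example_topic_b"],
        hdfs_url="hdfs://namenode.example:8020",
        metastore_uri="thrift://metastore.example:9083",
    )
    print(json.dumps(payload, indent=2))
```

Compared with Camus, the point of interest is that all of this is declarative connector configuration rather than a scheduled MapReduce job.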