
Isaac support for WikiNLP Workshop
Closed, Resolved (Public)

Description

Isaac's support for:

Event Timeline

Weekly updates:

My summary of takeaways from running this first edition:

  • Things that went really well:
    • Metawiki as conference page: very easy to update (by anyone); free pageview tracking; very on-point
    • OpenReview Help Desk is wonderful – use them!
    • I enjoyed shepherding in one of our papers even though it took a fair bit of work
    • 2-minute pre-recorded lightning talks are a good length
    • Day-of document I think is very helpful for forcing organizers to be concrete and providing info to participants
    • Our speakers were great -- I really enjoyed the content! Same with our support folks!
    • Gather was really nice for the virtual poster session.
  • Things that were hard the first go-around but will be easier with experience:
    • I now understand how OpenReview mostly works but it had a pretty steep learning curve
    • We underestimated the review load -- I'll work to get more reviewers in advance this time
    • Checking camera-readies is a pain. Make sure to have the automatic checks available and follow up early with folks. Perhaps even give guidance for simple mistakes not to make.
    • None of the organizers being in-person (hopefully not a repeat; that was just pure bad luck)
  • Things that were hard and won't necessarily get easier:
    • Having lots of different types of papers etc. makes the proceedings and getting folks the right instructions difficult. OpenReview is only really useful if you're emailing individual authors or everyone. I'll have to think through ways to do multi-track a bit better there.
    • EMNLP is expensive (even for virtual workshop alone)
    • Hybrid is difficult – really either have to be in-person with virtual watching or run two parallel workshops. If virtual was very cheap, I wouldn't feel so bad about it being a degraded experience but it's still quite expensive so we really need to do it well if we're allowing it.
    • We don't have an email list for everyone attending, which makes it difficult to do things like remind folks, share out details in advance to get them excited, and share the day-of doc to get everyone in the notes. We'll have to think if there are ways around that.
  • Things I intend to do differently:
    • Running a full review process is difficult – you have to recruit a lot of reviewers, remind them, do meta-reviews, curate the proceedings, etc. And the value add isn't huge (best argument I've heard is that peer-review helps for getting funding to attend). I think better to either run a more focused review process with standards that are closer to a conference (expect not a lot of papers but hopefully high-quality ones) or to mostly work with non-archival work.
    • Saturday workshop was hard for participation. In general the folks that showed up did engage though and obviously that was a lot of paper authors. I want to think through how to get lots of people to submit something (however small) to the workshop so they've made that small commitment to attending and have something to discuss while they're at the workshop.
    • Making space for authors who do have archival papers to share. The poster session was good but I'd like to spotlight more of them. Maybe even summarize the work ourselves (this is what TREC does and I really like it).
    • Logistics of tracking papers/authors is hard. I think having a spreadsheet outside of OpenReview will help a lot with this.
    • Planning the schedule and coordinating speakers was a challenge. I think having more discussion options will make it more participatory and simpler.

Updates:

  • Working on the CFP with the goal to release around the end of the month. Beyond getting the wording right and setting up OpenReview, the big lift here is defining the datasets that we want folks to submit (our core peer-reviewed track). Our focus is going to be Core Content Policies or anything that can be directly tied to the Community Wishlist, but we need to curate some examples to get folks started and be clear about the sorts of work that we're hoping to see.
Isaac triaged this task as High priority. Jan 28 2025, 5:26 PM

Updates:

  • We moved back the deadline by a week (now April 30th)
  • Offering office hours explicitly to folks intending to submit has been helpful I think. I've had two folks use them now and I was able to give some ideas of work to be aware of and important framing considerations for the workshop.
  • About to do another round of reminders about the workshop

Updates:

  • Sent out review requests
  • Registration has opened so will be looking into that
  • ACL reviews out so we can look for other Wikimedia papers to invite as well

Updates:

  • Sent out paper notifications and working on reaching out to authors of relevant Findings papers
  • Planning day-of-agenda now

Updates:

  • Proceedings accepted
  • Schedule largely finalized (both keynotes) but still looking for a local Wikimedian for a conversation/AMA. Turns out that Europe+August=Vacation is still alive and well :)

Finished! I'll record some of my takeaways below and then close this task out.

Three core challenges that I raised to the *CL workshop organizers that limit the value of organizing something like WikiNLP:

  • Participation suffers greatly by day 6: our workshop happened to be on Friday, which meant it was on Day 6 of a massive conference. At that point, many folks had physically left Vienna or cognitively were just exhausted (self included). I imagine that this hurts participation quite a bit (both in terms of numbers and energy) but it also takes away from what I see as one of the really important roles that workshops can serve: as a space to meet new researchers at the start of a conference. When workshops are at the start of a conference (as is the norm for various conferences run by ACM for example), you have time during the conference to build on these connections and meet folks whose talks/posters you might want to go see.
  • Content vs. overhead: I'm beginning to feel that I spend as much time ensuring that everyone pays ACL as I do trying to put together an engaging program for participants. I understand that conferences need money to operate but it has reached a point where it detracts from my ability to organize well. It shifts what should be a structural burden owned by ACL to an individual burden across all of the organizers. I advocated for soliciting fewer archival papers this year in our workshop because the overhead of building a "correct" proceedings and making sure that authors are registered to the paper did not feel worthwhile. A virtual invited panelist for 45 minutes cost $300. The tools provided by ACL to verify attendance etc. also have many gaps. For instance, we weren't given a potential attendance list until the week of (when we asked for it), which made it difficult to track who might attend, what to expect as far as in-person vs. virtual participation, and to contact folks who weren't paper authors.
  • Information sharing: there's a Google group for workshop organizers and I should have used it more during the organizing phase. Organizing is complicated and doesn't follow a specific pattern because every workshop is a bit different. But I think if I repeated the experience, I would try to do better to share/solicit approaches from the other organizers – e.g., how to best solicit papers from ARR to submit to your workshop; trade-offs between archival and non-archival; OpenReview tricks; how to best support virtual participants if your workshop is hybrid; etc.

In the interest of not just pointing out what was hard, I think it's also important to acknowledge some things that are working well:

  • This was my second workshop being run from OpenReview, and while it's easy to point out issues with the system, I found it a lot easier in this second iteration. It helps a lot to have yourself as a dummy author and reviewer to check visibility for the various stages and preview emails etc. Their help desk is also fantastic when issues do arise. Even as I complain about compiling the proceedings, the scripts have gotten better for that and again having prior experience makes it much simpler in future iterations.
  • Our participants really enjoyed the shared poster session and in general I find the poster sessions to be really nice ways to have deep conversations about your research that wouldn't happen otherwise.
  • The folks who do show up on Day 6 are excellent and I of course very much value the conversations that came out of the workshop and new researchers I was able to meet through it.

More fine-grained takeaways:

Things that went well

  • Metawiki as conference page: very easy to update (by anyone); free pageview tracking; very on-point
  • OpenReview Help Desk is wonderful – use them!
  • Day-of document I think is very helpful for forcing organizers to be concrete and providing info to participants
  • Keeping archival papers to a minimum made proceedings far simpler – only one author needed to make corrections and collecting copyright forms was simple
  • Having two keynote speakers who were already attending – that's partially luck but made it a lot better
  • Having good co-organizers! Three organizers in-person (and two remote)! Also being in-person was just so much nicer.
  • Keynotes were fantastic (Monica Lam and Matthias Gallé)!
    • Monica's was about specific Wikipedia work she did and so clear relevance.
    • Matthias was just about multilingual modeling in general, but the relevance to Wikipedia was clear and he's a good speaker
  • People love the in-person poster session

Things that get easier every year

  • OpenReview gets a lot easier to work with as experience with it increases
  • Publishing proceedings gets a lot easier as experience with it increases

Things that were hard but will be less hard

  • Don't overlap with WikiWorkshop – made it harder to recruit reviewers, advertise, make clear the value, recruit papers, etc.
  • Always recruit more reviewers than you think you need
  • Summer vacation made recruiting local Wikimedians hard

Things that were just hard

  • Workshop on Day 6 (!!!) of the conference – people are exhausted and attendance is low
  • ACL is expensive and pretty complicated to navigate
  • Hybrid is difficult – really either have to be in-person with virtual watching or run two parallel workshops
  • How to share out things like the day-of doc or a reminder of the workshop to encourage participation? We did have emails from Swoogo so probably should have just planned to use those? No one used the day-of doc other than organizers.
  • Getting Wikimedian participation for this particular workshop. This is not on Wikimedians but I think WikiWorkshop was too close by and that attracts a lot of engagement so we were careful to not detract from that. And then with all the community conversations around AI+Wikimedia, I wanted to be careful to not add confusion by requesting that Wikimedians share how NLP might actually be useful to them (the initial framing of Track 1 in the workshop). Just something where I would need to think more long-term about the best way to collect this sort of input to be able to present to researchers.
  • Many many moving parts and deadlines to meet and information to fill out

Things that I want to do differently / consider

  • Can we piggyback on ARR like https://sites.google.com/view/nlp4positiveimpact/call-for-papers-2025 ? That would likely require expanding the scope beyond Wiki to e.g., Open Knowledge or Online Communities or something like that
  • Small discussion groups or something more interactive like WikiWorkshop?
  • Have a nice structured data source of truth for the papers so it's easier to track all the things that we need
  • Run a technology-mediated edit-a-thon?
  • Shepherding in more papers? I really enjoy where I can directly support authors in their submissions and would love to expand out this part.
  • Public reviews (opt-in)?
  • Sponsorship to pay for social event?
  • Do we need any archival papers?
  • Emailing registrants to the workshop to ask them to pre-submit a "why am I interested in this workshop" paragraph so we can try to address these but also have a list of committed emails to send reminder emails to etc.
  • Just run a tutorial that's very interactive?

Resolving this. An open question is what this would look like for next year. There's a September 5th deadline for EACL/ACL: https://www.aclweb.org/portal/content/eaclacl-2026-joint-call-workshops but EMNLP/AACL will release a call later in the fall.