
Fix Newspapers.com Library Bundle Configuration issues
Open, Stalled, High, Public

Assigned To
None
Authored By
sjvipin
Nov 11 2022, 10:48 AM
Referenced Files
Restricted File
Thu, Jul 4, 4:12 AM
F41721934: image.png
Jan 27 2024, 7:22 AM
F41436405: image.png
Nov 3 2023, 4:06 AM
F37819456: Screenshot_20230927-201747.png
Sep 28 2023, 3:25 AM
F37817066: image.png
Sep 27 2023, 1:15 PM
F36890219: image.png
Mar 3 2023, 2:32 PM
F36886800: image.png
Mar 1 2023, 1:33 PM
F36886136: Screenshot 2023-03-01 at 10.47.58.png
Mar 1 2023, 10:50 AM
Tokens
"Love" token, awarded by HouseBlaster. "Like" token, awarded by Novem_Linguae.

Description

Current issues

  • We do not currently seem to have access to some Publisher Extra content
  • Users cannot log in via Ancestry
  • Users cannot log in via Facebook
  • Users cannot register a new account via Email

Past issues
Cloudflare is preventing access. Users get stuck at the message "www.newspapers.com needs to review the security of your connection before proceeding."

Users currently cannot log out of their account while proxied, or log in via Ancestry.

This change has now been deployed, however we are seeing some access issues. Some users cannot retrieve search results, and logging in to accounts sometimes doesn't work.

Users can't login as they see the following error:

image.png (133×482 px, 13 KB)

Newspapers.com fixed the initial issue here, which was enabling reCaptcha, however we're now getting a slightly different error about an "invalid domain key". We should be able to fix this issue, and are investigating.

Original task
Our existing partner newspapers.com has agreed to move to library bundle. We need to update their EZProxy configuration.

Partner id: 26 (https://wikipedialibrary.wmflabs.org/partners/26/)

Before making this change please send me a list of the user emails (privately) with active authorisations so I can let them know.

Event Timeline

jsn.sherman changed the task status from Open to In Progress. Jan 30 2024, 5:20 PM
jsn.sherman changed the task status from In Progress to Stalled. Jan 30 2024, 5:43 PM

I requested assistance from our technical contact. Marking as stalled until we hear back.

Do we at least have an explanation of why this happens (repeatedly)?

Is there any chance we could go back to the old method, where we simply had accounts recognized on the standard newspapers.com site, especially since that made using automated citations significantly easier?

jsn.sherman changed the task status from Stalled to In Progress. Jan 30 2024, 11:11 PM
jsn.sherman moved this task from In Progress to QA on the Moderator-Tools-Team (Kanban) board.

These errors should stop showing up for everyone within the hour. Apparently there was activity on the OCLC network that tripped security rules for the site. OCLC hosts our web proxy, so when they get blocked, we get blocked.

I finally remembered to test it and it worked.

I don't know if this is a related problem or should be filed as a separate ticket, but there are a very small number of papers which cannot be accessed through our library bundle. When you try to visit them, it gives you the same "Upgrade to a Publisher Extra Subscription to view this page" blocking pop-up that you would receive if you didn't have a subscription at all. Some examples I have encountered: https://www-newspapers-com.wikipedialibrary.idm.oclc.org/image/864192942/ https://www-newspapers-com.wikipedialibrary.idm.oclc.org/image/548374608/ I assume this is an error and Newspapers.com is not deliberately restricting these from us?

To clarify, 99.9% of Publisher Extra content still works totally fine, at least for me right this moment. It's just a scattering of these few papers that don't work, like the ones I linked. I think it applies to the entire run of a given paper, so we can't access any of the Lima News or the Brattleboro Reformer. It's a small issue, I just don't know who else to raise it with.

Hmmm… can't reach this page. wikipedialibrary.idm.oclc.org refused to connect.
Try:

Checking the connection
Checking the proxy and the firewall
ERR_CONNECTION_REFUSED

I also can't connect to any Wikipedia Library OCLC service right now (Firefox says "Unable to connect - An error occurred during a connection to www-newspapers-com.wikipedialibrary.idm.oclc.org."). The problem persists across different devices, and started sometime early this afternoon. I work for a branch of the University of Wisconsin and have heard a lot in recent weeks about transitory OCLC outages affecting the university's services, so I suspect this could be another manifestation of that, rather than anything on Wikimedia's end.

It looks like OCLC is performing maintenance on their servers, per https://oclc.service-now.com/status

I also can't connect to any Wikipedia Library OCLC service right now (Firefox says "Unable to connect - An error occurred during a connection to www-newspapers-com.wikipedialibrary.idm.oclc.org."). The problem persists across different devices, and started sometime early this afternoon. I work for a branch of the University of Wisconsin and have heard a lot in recent weeks about transitory OCLC outages affecting the university's services, so I suspect this could be another manifestation of that, rather than anything on Wikimedia's end.

This seems to be an ongoing issue beyond maintenance - see T365553 for updates.

I could have sworn I posted here this morning.

Seems to be working now, though.

I've been unable to log into Newspapers.com through Ancestry using my Wikipedia Library account for weeks now. I can log into my Library account just fine. When, in Newspapers.com, I try to create a clipping to cite in a Wikipedia article, the Newspapers.com signin popup appears. I click "Ancestry" and the popup for name and password appears. I enter those. That popup disappears but the signin popup remains. Clicking the "Ancestry" button again makes it disappear and instantly reappear. When I go to Newspapers.com, I am logged in there, but I don't have Publisher's Extra.

Is there any expectation of this being fixed? It really impairs our ability to present information responsibly on Wikipedia.

Thanks so much for all the hard work you do that we never see!

@Oona_Wikiwalker, it might not be terribly convenient, but have you tried creating a Newspapers.com account entirely separate from your Ancestry account and using that? You should be able to use the same email address for both. It works for me, at least. That may be the best workaround until or unless that integration is fixed.

Per my note in T366070, we are considering moving Newspapers.com off proxy configuration back over to the manual account setup and renew method we had before this. Unlike most of the proxy configurations in the library, this one is a manually configured proxy which took us some time to set up and is very fragile to breaking. It's a big time sink for us to continue maintaining and fixing it.

For those who never had an individual account for Newspapers.com, the process worked like this: You would need to register an account for Newspapers.com (if you don't have one already), then file an application with us via the library to get your account upgraded. This would last 12 months, at which point you would need to request a renewal (we send automated reminders). This means you would access Newspapers.com directly on their website, rather than going via the library - it would be easier to directly cite Newspapers.com content without unproxying the URLs, and the website would behave normally without any proxy weirdness like Ancestry logins not working. The downside, of course, is the application-and-renewal process, which can result in some access downtime, and requires a little extra effort. Please let me know what you think.

I personally preferred the old way, even ignoring the issue of reliability, just because you could copy and paste URLs directly off the page without having to manually turn "https://www-newspapers-com.wikipedialibrary.idm.oclc.org" back into "https://www.newspapers.com" so ordinary users could access the article. That being said, if we do go back to the old way, I would hope that the application process could be streamlined somehow. It was always a little nerve-wracking to wait for weeks to get re-approved, and sit there wondering "are they looking at my edit history? have I been a good enough Wikipedian to get renewed?"
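For anyone doing that conversion by hand, here is a minimal sketch of how it could be automated. It assumes the proxy only replaces the dots in the original hostname with hyphens and appends `.wikipedialibrary.idm.oclc.org`, which matches the Newspapers.com URLs quoted in this thread. The function name `unproxy_url` is made up for illustration, and hostnames that contain real hyphens are not handled.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix the Wikipedia Library proxy appends to rewritten hostnames (taken from the URLs in this thread).
PROXY_SUFFIX = ".wikipedialibrary.idm.oclc.org"


def unproxy_url(url: str) -> str:
    """Turn a proxied URL back into the publisher's URL (hypothetical helper).

    Assumption: the proxy replaced each "." in the original hostname with "-".
    Hostnames that contain real hyphens would be converted incorrectly.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(PROXY_SUFFIX):
        return url  # not a proxied URL, leave it untouched
    original_host = host[: -len(PROXY_SUFFIX)].replace("-", ".")
    # Any explicit port in the proxied URL is dropped on purpose.
    return urlunsplit((parts.scheme, original_host, parts.path, parts.query, parts.fragment))


print(unproxy_url("https://www-newspapers-com.wikipedialibrary.idm.oclc.org/image/864192942/"))
```

The path and query string pass through unchanged, so search and clipping links keep their parameters.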

I use the Zotero browser plugin, which can automatically proxy and un-proxy links, so I've not had a problem dealing with proxied links.

It was always a little nerve-wracking to wait for weeks to get re-approved, and sit there wondering "are they looking at my edit history? have I been a good enough Wikipedian to get renewed?"

I never previously applied for this reason.

I'm currently getting an "Upgrade to a Publisher Extra Subscription to view this page" message on all pages. Just thought I should mention it.

I've never minded unproxying the links. It's no more annoying than having to wait around while Archive.Today creates its saved page so I can grab the long link. Is continually renewing accounts' access to newspapers.com less work than dealing with the fragility of the manual proxy? I don't want to be unreasonable.

As for how many of us access Newspapers.com, I know I do it all the time, because I don't put in articles what isn't supported by sources. Since I haven't had access, I've been hunting down geographic coordinates instead. However, tonight I'm trying to fix the reference section of someone's MONSTER of an article that doesn't link its references. There are over a hundred of them and many are newspapers.

So. I vote for whatever lets me fix stuff because there is so much in Wikipedia that's below par!

Thank you so much for your efforts. You must know you make a difference, but I know it too. (I've been "suffering" without my Newspapers access, lol! I guess Wikipedia does too.) There have to be others like me, they just don't know about Phabricator.

I'm currently getting an "Upgrade to a Publisher Extra Subscription to view this page" message on all pages. Just thought I should mention it.

The site works fine in Microsoft Edge, but fails to recognize the institutional access we have in Firefox, from what I can tell after testing multiple devices.

Newspapers.com is behaving like above in Firefox, Chrome or the Brave browsers I use. I don't really use Microsoft Edge...

As a test, I tried spoofing the Firefox user-agent claiming to be Edge and it didn't help. So it's something about the actual functionality of the different browsers that is creating the different behavior.

Thank you for looking into that. The reason I don't just get another Newspapers.com account is that I've learned that you can never plan what's going to happen. Twice in the last four years, we here at Casa Oona were just living our lives when one of us suddenly ended up in the hospital for weeks. I was there when not at work because the disabilities the people had were more than the nursing staff could handle with their workload. For a good while after coming home, I did not have the time, energy, or available mental acreage to hunt down how to avoid being migrated from a free trial to a paid subscription either. So I don't know what to do...

If this helps anyone, I have found that the following method resolves this issue with Firefox for Android.

  1. Log out of Wikipedia.
  2. Close any open tabs from Newspapers.com.
  3. Delete all browser cookies.
  4. Log in to Wikipedia.
  5. Log in to the Wikipedia Library (and select Newspapers.com).

Confirming that Edwin of Northumbria's suggestion worked for me using Firefox on Mac. In my case, I deleted only cookies for Newspapers.com.

Just tried this and it worked. All I did was delete the cookies, no logging out or closing tabs.

I'm grateful for the suggestions, but I don't use Firefox on my phone. I tried downloading it to my laptop and no luck with Newspapers.com.
I do regularly clear cookies from my browsers and I still can't log into Newspapers.com with my Ancestry account. I understand this is a lot of work and it's probably beyond annoying. I've been trying to avoid it by, as I said, hunting down coordinates. I've worked through the backlogs for California, Arizona, Hawaii and Florida. I'm halfway through Texas. But I came across a page, Raid on Norias Ranch, with so many inaccuracies that I had to rewrite it. A person central to the event was replaced by a fictional person. People said to have been killed were not. The date was wrong, another person was misnamed, etc., etc.

One of the main sources I relied on for the rewrite was a newspaper article containing the account of one of the participants. Only I can't post the clipping because of our issue here. I've had to resort to posting this as a source and telling people to search the source code for the account. (Search the page (not the code) for "Gay" to see why. You're looking for a man named D. P. Gay.) I don't even mind posting subscription-requiring links to newspapers.com for the obituaries I cite, but the issue with this major source is just too much.

I know fixing this is a lot of work and a royal pain. Our late-stage capitalism takes its toll on being able to find useable sources and public domain files and we editors who try to do solid work have to fight harder for that. I can't be the only one dealing with this. I'm guessing a lot of editors don't know about Phabricator (I only found it because finding things is what I do, lol) and they just post unsourced content in articles.

We depend on you so much. What you do makes an enormous difference, even though you probably get no thanks or notice. I'm sorry you don't. But I thank you.

@Oona_Wikiwalker, I'm not currently having any issues viewing and clipping via the integration; see this clip. Oddly, when I try to view the OCLC-integrated version of the clip (i.e. this one), I get "This page is not available This page is pending publication, has been removed or moved to a new location". (While I was there... clip 2 clip 3)

Also, I'm not sure if it's an OCLC integration issue or a Newspapers.com issue, but the search keeps forgetting the location when adding date ranges, which means every search change takes two attempts.

@KylieTastic , are you trying to log in to Newspapers.com through Ancestry.com?
And those are FANTASTIC articles! However, the Wikipedia article is about the Norias Division raid, and this is a very sore issue in Texas. I've been trying to keep the scope of the article strictly to the raid and save the politics surrounding the issue for an article on that subject. The article I had to use that silly workaround for is this one: https://www-newspapers-com.wikipedialibrary.idm.oclc.org/image/286550494/?match=1&terms=%22george%20j.%20head%22

And I'm really grateful to you for those clips, but I can't go around imposing on others to clip articles for me. It's just not fair to them, including you.

@KylieTastic , are you trying to log in to Newspapers.com through Ancestry.com?

Ah no, I have a Newspapers.com account from before they changed to the OCLC integration.

I've only ever had an Ancestry account there, from when they first began offering them.

I'm using Chrome on a Macbook. On June 22 I accessed the resource as usual, searched and clipped several articles. On June 23 and since when I click the button to access the collection, I get to a welcome page where I am invited to try 7 days free and upgrade by paying for a subscription. I am logged in as HazelAB and the account details tell me I'm a Registered Guest and need to pay for a subscription. I can see a list of the clippings I saved but when I click on any I get a brief glimpse of the clipping, and then a screen appears with the message "This page is not available This page is pending publication, has been removed or moved to a new location"

I've had similar problems in the past few months, but they've not lasted as long.

Maybe reverting to the old application/renewal system would be preferable.

I am also now a Registered Guest. When I do a search for anything I get the following:

"There was a problem loading the search results. Please try again.

If the problem persists, please contact us."

Having the same issues as Gamaliel and HazelAB. It is clear that the proxy implementation is so rough that it is degrading lots of site functionality. Search only works outside the proxy.

I was having the same search problem earlier, but it went away after a little while. Now it's back again.

jsn.sherman changed the task status from In Progress to Stalled. Sun, Jun 30, 5:35 PM
jsn.sherman removed jsn.sherman as the assignee of this task.

Can't log in to an account or search. "There was an error while loading the form. Please contact customer support."
{F56212282}

Under macOS and under both Firefox and Chrome. I can search newspapers.com for articles. However, I can't log in to clip articles.

I tossed all my cookies (boy did that feel good!) and rebooted because it had been awhile.

When I try to log in using email to newspapers.com, I get "There was an error while loading the form. Please contact customer support"

When I try to recover my newspapers.com password I never get the email.

When I try to create a new newspapers.com account, I get "There was an error while loading the form. Please contact customer support"

Maybe we should go back to the old system instead of the proxy.

Can't log in to an account or search. "There was an error while loading the form. Please contact customer support."
{F56212282}

I'm getting the exact same error every time I try to log in: "There was an error while loading the form. Please contact customer support."

Strangely enough, login works perfectly fine when I try to access Newspapers.com outside of the proxy! I am actually currently logged into my account that way – but because I don't have a paid subscription due to using it through the Wikipedia Library, I can't access the articles I need to clip. I've tried everything I usually do when I encounter similar problems (logging in/out, deleting cookies, clearing cache, etc.), but nothing has worked.

I'm at least relieved to know it isn't a problem on my end, but it is still very frustrating. Doubly so because I've had a lot more technical issues with the proxy system than I did with the old method.

If it helps, I'm using fully up-to-date Firefox on a MacBook Pro, currently on macOS 14.1.2 (23B92).

Same here. It doesn't seem to be possible to log in at all, since all three login methods are not functioning now. This applies both to Firefox and to Microsoft Edge, which had previously been an effective workaround to other issues.

My account randomly signed me out, so count me in.

I feel like a lot of the errors are "autoimmune": forms, scripts, etc. checking for something and seeing unexpected results from resolving to another domain. In other words, Newspapers.com is so not built to run in an ezproxy setting that functionality is crippled.

Add me to the chorus of people who want the old system back ASAP. To be blunt, the ezproxy is a failure and needs to be scrapped; it is disappointingly clear that it cannot work at all on Newspapers and is impairing my efforts on Wikipedia substantially. Extremely disappointed.

Everything was working for me up until a few hours ago. I was able to sign in to an account as recently as a week ago, and then I was just signed out within the last day and now I can't access anything.

Things seem to have gotten worse: until this morning, I could still view any page without being logged in, I just couldn't make clippings. Now when I'm in the proxy system, I get the same "You need a subscription to view this page; Start a 7-Day Free Trial" message that one would get if they had no access at all.

I wonder if the login problem might be related to Cloudflare, since when I log in to the non-proxy newspapers.com, there's a "Verify you are human" checkbox. But the fact that we no longer have the benefits of non-logged-in access either suggests that there's more to it than that.

I can't make a clipping either. It tells me to log in. When I do (from newspapers.com) and I go back to the page, it tells me to log in.

Also, if I try to log in by clicking on the link, it tells me there was an error and to contact support.

Count me affected, too. I like the EZProxy setup better than the previous logins, but Newspapers.com has some work to do to make this work smoothly. I also can't imagine that Wikipedia is the only group with institutional access to Newspapers.com affected by these issues.

Things seem to have gotten worse: until this morning, I could still view any page without being logged in, I just couldn't make clippings. Now when I'm in the proxy system, I get the same "You need a subscription to view this page; Start a 7-Day Free Trial" message that one would get if they had no access at all.

I wonder if the login problem might be related to Cloudflare, since when I log in to the non-proxy newspapers.com, there's a "Verify you are human" checkbox. But the fact that we no longer have the benefits of non-logged-in access either suggests that there's more to it than that.

We are starting to get somewhere in diagnosing this issue. The error box appears after a cookie for challenges.cloudflare.com is added.

I can't explain the clipping page issue. It seems to be involving the NextJS JavaScript framework they use or maybe React. Someone with more knowledge of those components might have a better shot.

I think the clipping page issue is one and the same with the login issue, because you need to be logged in before you can create a clipping.

The logging out, BTW, is not universally applied; for some reason, I am still logged in to the NP ezproxy on one laptop even after my primary and secondary laptops bounced me out.

That said, I am still in favor of removing the ezproxy entirely and going back to the old system.

I think the clipping page issue is one and the same with the login issue, because you need to be logged in before you can create a clipping.

That makes no sense because clippings are publicly viewable. It's like it's tripping an error page for no good reason. I do think they are related and may even have the same root cause (I didn't see any Cloudflare cookie generation issues), but they are not identical.

@SammiBrie: Sorry, I thought you meant the issue with creating clippings, not an issue with viewing them. I have experienced the problem with viewing clippings as well, but in my experience it's resolved by just refreshing the page a couple times until it works. I frankly didn't realize that was a problem specific to the proxy, I thought it was just Newspapers.com itself glitching out.

I can't see search results regardless of whether I am logged in.

I sign into newspapers.com but when I try to create a clipping it tells me to sign in.

I use Newspapers.com through the proxy but also log into a subscriptionless personal account simultaneously in order to clip articles. This has mostly worked except for some hiccups back in June, which I resolved by clearing my cookies for both newspapers.com and www-newspapers-com.wikipedialibrary.idm.oclc.org. Unfortunately, today I got logged out and clearing cookies did not help. As others noted, when I tried to log in directly through newspapers.com, I saw a CloudFlare challenge as part of the login form. The proxied login form also attempts to display this challenge, but the XMLHttpRequest to https://challenges.cloudflare.com/cdn-cgi/challenge-platform/h/b/flow/ov1/… fails with an HTTP 400 error. In the console, I see the following error and warning:

turnstile onError 110200
[Cloudflare Turnstile] Error: 110200.

According to this documentation, error 110200 means that Newspapers.com has not configured CloudFlare to accept the domain www-newspapers-com.wikipedialibrary.idm.oclc.org; they need to add it to the CloudFlare dashboard. Is this something we can get them to do?
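To illustrate why the proxied page would fail such a check while the direct site passes, here is a rough sketch. It assumes the widget compares the page's hostname against an allow-list and also accepts subdomains of listed entries; that matching rule, the allow-list contents, and both function names are assumptions for illustration, not Cloudflare's confirmed logic or Newspapers.com's actual settings.

```python
def proxied_hostname(origin_host: str, proxy_suffix: str = "wikipedialibrary.idm.oclc.org") -> str:
    """Build the hostname the proxy serves a site under.

    Assumption: dots in the origin hostname become hyphens, as seen in this thread's URLs.
    """
    return origin_host.replace(".", "-") + "." + proxy_suffix


def hostname_allowed(page_host: str, allowed_hosts: set[str]) -> bool:
    """Hypothetical allow-list check: exact match or a subdomain of a listed hostname."""
    return any(page_host == h or page_host.endswith("." + h) for h in allowed_hosts)


# Hypothetical allow-list; the real widget configuration is not known to us.
allowed = {"newspapers.com"}

direct = "www.newspapers.com"
proxied = proxied_hostname(direct)

print(direct, hostname_allowed(direct, allowed))
print(proxied, hostname_allowed(proxied, allowed))
```

Under these assumptions the proxied hostname ends in `idm.oclc.org`, so it can never match an entry for `newspapers.com`. It would have to be listed explicitly on their side.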

(I also see a 400 error from https://cdn.privacy-mgmt.com/mms/v2/get_site_data, but this appears to be related to showing a GDPR notice and probably doesn’t explain the failure.)

How can you log in, mxn? I've been seeing Newspapers.com flagged as unavailable in the Library for days now...

I meant that I went directly to www-newspapers-com.wikipedialibrary.idm.oclc.org and clicked the Login button, which I would normally do to connect the Wikipedia Library access to my personal account. But this no longer works because of the CloudFlare error I described.

As I've said multiple times in this thread, that doesn't work for me. My account is through Ancestry. I can log in from newspapers.com itself, but I can't clip from the library access. And I've barely been able to add content to Wikipedia for nearly three months because I don't put in what I can't verify. Basically, my only sources are books from Internet Archive and websites.

I wonder if anyone has told Newspapers.com that they probably get more exposure (free advertising) from people checking Wikipedia sources than from people doing genealogy? I've done all my genealogy, back to the 1500s and I never used newspapers.com. However, I bounce there all the time looking at sources in wikipedia articles. I always check what people tell me, lol.

To be clear, I’m not saying I have a solution or even a workaround. I’m explaining what I did to narrow down the issue to that CloudFlare error, in the hope that someone can nudge Newspapers.com to adjust their CloudFlare settings.

Just now, I returned to www-newspapers-com.wikipedialibrary.idm.oclc.org and found that the “Welcome from Wikimedia Foundation” banner had returned and I can once again search the archives. However, the CloudFlare error on the login page still prevents me from logging in and creating clippings. So we aren’t out of the woods yet, but if others are seeing the same improvement, then I suppose TWL can reenable the Access Collection button.

Hello!

Thanks for your comments. I just wanted to let you know that we are aware of the ongoing issues and we are working on identifying a solution. Thanks for your patience on this.

I can't sign in to create clippings.