So basically just do what I did in 3.x, only use the actual specific number of successes versus failures to determine outcome, rather than just winging it?
Winging it makes it mostly a check against the capriciousness of the DM, with streaks of autowin mixed with a couple of 'Oh, you rolled a 1, this bad thing happened' moments, and a lot less firm thinking along the lines of: if they succeed, this cool story option happens; if they fail, this other cool story option happens. Sometimes winging it works great with the DM, sometimes it doesn't. Codifying things so they don't take too long, give out xp, and have pre-defined good and bad results is a good thing.
Some groups dig knowing they're in the challenge and working through it in various ways (Stalker's Obsidian is good for groups like that) while others don't want the immersion broken. For those, I'd suggest the invisible skill challenge. They still end up getting extra xp and their skills still matter.
Just remember that when lots of people roll the same check, turn it into a group check where a majority need to pass or something similar, not 'Okay, I guess that's 3 successes and 2 failures for the same check'.
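If it helps to see the bookkeeping spelled out, here's a rough sketch of the group check idea. The function names, the strict-majority rule, and the example DC are my own assumptions for illustration, not anything from the rulebooks; swap in 'at least half' or whatever threshold your table prefers.

```python
def group_check(totals, dc):
    """Return True if a strict majority of the check totals meet or beat the DC.

    `totals` are each player's final check result (d20 roll plus modifiers).
    The strict-majority threshold is an assumption; adjust to taste.
    """
    passes = sum(1 for total in totals if total >= dc)
    return passes * 2 > len(totals)


def tally(outcomes):
    """Count successes and failures, where each outcome is one True/False check."""
    successes = sum(1 for passed in outcomes if passed)
    failures = len(outcomes) - successes
    return successes, failures


# Five players all roll the same check against a hypothetical DC of 15.
party_totals = [18, 12, 21, 9, 15]

# The whole group check counts as ONE outcome toward the challenge,
# not as three successes and two failures.
outcomes = [group_check(party_totals, dc=15)]
successes, failures = tally(outcomes)
```

Here three of the five totals meet the DC, so the group check passes and the challenge moves on by a single success.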
Regardless of what you do, lower complexities, or high complexities spread over an extended time, are probably the way to go. 12 skill checks in a row can throw people off, but 5 here, a combat, 5 there, a rest, or 2 a day over a week in a city, etc. are all less glaring.