#211035 - 2016-01-21 08:21 PM Periodic "ingroup" failures
JNK Offline
Fresh Scripter

Registered: 2006-04-11
Posts: 33
Loc: USA
I am chasing a nagging problem in our environment and I am hoping the smart folks in this forum can offer some advice.

Our Logon Script has a block of code that sets a number of User variables based on the user's AD Group membership. This code has been running [mostly] flawlessly for a long time (15+ years). A few months back, we started seeing errors for some users when running the INGROUP function -- which caused a cascade of errors when running other Logon Script modules.

Skipping all of the gory troubleshooting bits, we discovered the Kixtart TokenCache (hkcu\software\kixtart\tokencache) is truncated on machines throwing the error. Actually, it appears to quit after the Machine (local) group membership; none of the Domain groups (Local, Global or Universal) appear in the Token Cache. I also noted machines with truncated token cache are missing the CacheAge value.

The command line that runs the Logon Script already contains the /F parameter to flush the cache before each run, so we should be OK.

We have a variety of manifestations of the error:

- For some users, the error occurs on some logons but not others
- For some users, the error occurs on every logon
- Most users never see the error at all

I can find no common denominator for users consistently getting the error -- or any other common denominator for that matter.

After the error happens, if I manually run the Logon Script, the error may or may not happen again (50/50).

The error does not appear to follow the user to another machine. A different user logging onto a machine that exhibited the error previously may or may not have a problem. It usually does not for new user profiles.

My environment looks like this:

- Windows client-server domain with two-way (in and out) Trusts to five (5) other domains
- Forest functional level: 2003
- Domain functional level: 2008 R2
- Domain Controllers: 57 DCs in 53 sites; Server 2008 R2
- Clients: Windows 7, SP1, 32-bit
- We are [mostly] current on patches
- Kixtart version: 4.62.

I have already read through some of the other posts on this topic (http://www.kixtart.org/forums/ubbthreads.php?ubb=showflat&Number=196497 and http://www.kixtart.org/forums/ubbthreads.php?ubb=showflat&Number=193933#Post193933). I tried installing MS KB262958 (it is "not applicable" to our systems). I also looked at http://support.microsoft.com/kb/976494 and the related Hotfix ... no joy.

Deleting and recreating the User Profile does not fix the problem either.

Over the past few days, I "fixed" about a dozen machines by running "wkix32.exe /F" by itself, confirming the token cache was removed from the registry (hkcu\software\kixtart\tokencache), running a script with a single INGROUP function in it ($_ = ingroup("domain users")), confirming the token cache was rebuilt successfully (including the CacheAge value) and then restarting the machine. Problem solved, right? I would like to think so, and most of the machines are still OK, but I have 3 machines that are broken again today.
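
The "confirm the token cache was rebuilt" step can be roughed out as a check on the text output of `reg query hkcu\software\kixtart\tokencache`. This is a minimal sketch in Python (not KiXtart), and it rests on the one symptom described above: a truncated cache has no CacheAge value. The function names and the sample dumps are hypothetical, and every value name other than CacheAge is made up for illustration.

```python
import re

# One value line of `reg query` text output:
# four-space indent, value name, four spaces, type, then optional data.
_VALUE_LINE = re.compile(
    r"^\s{4}(?P<name>.+?)\s{4}(?P<type>REG_[A-Z_]+)(?:\s{4}(?P<data>.*))?$"
)


def value_names(reg_query_output):
    """Return the value names found in `reg query` text output."""
    names = []
    for line in reg_query_output.splitlines():
        match = _VALUE_LINE.match(line)
        if match:
            names.append(match.group("name"))
    return names


def cache_looks_complete(reg_query_output):
    """True when the dump contains a CacheAge value (case-insensitive)."""
    return any(n.lower() == "cacheage" for n in value_names(reg_query_output))


# Hypothetical dumps: only the key path and "CacheAge" come from the thread.
healthy = "\n".join([
    "",
    r"HKEY_CURRENT_USER\software\kixtart\tokencache",
    "    CacheAge    REG_SZ    placeholder",
    "    ExampleGroupEntry    REG_SZ    placeholder",
    "",
])
truncated = "\n".join([
    "",
    r"HKEY_CURRENT_USER\software\kixtart\tokencache",
    "    ExampleLocalEntry    REG_SZ    placeholder",
    "",
])
```

A missing CacheAge only flags a cache that matches the symptom seen here; it says nothing about why the cache was cut short.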

I'm pulling my hair out here! Any insight / assistance will be most-appreciated.

~Jim

#211036 - 2016-01-22 12:08 AM Re: Periodic "ingroup" failures [Re: JNK]
Allen Administrator Offline
KiX Supporter
*****

Registered: 2003-04-19
Posts: 4535
Loc: USA
Any eventviewer error messages regarding kixtart?

Have you removed your AV/Security Suite to see if it is interfering?

Here is a thread similar to yours, but sadly with no resolution.
http://www.kixtart.org/forums/ubbthreads.php?ubb=showflat&Number=118172#Post118172

Have you checked your Replication between the DCs?

#211037 - 2016-01-22 05:34 PM Re: Periodic "ingroup" failures [Re: Allen]
JNK Offline
Fresh Scripter

Registered: 2006-04-11
Posts: 33
Loc: USA
There is one entry in the Event Log related to Kixtart; ID 1789: Failed to resolve SID(s) Error : The trust relationship between this workstation and the primary domain failed. (0x6fd/1789).

An nltest /sc_verify:domain comes back clean (0x00000000 success) and the machine functions normally (no errors when accessing network resources, printing, etc.), so this Event is bogus.

The error happens on machines without any A/V. If it were A/V, all 3000+ nodes would be throwing this error and that is not the case.

Replication between DCs is fine. Again, if it were an issue, every node in an affected site would be throwing the error.

I will have a look at http://www.kixtart.org/forums/ubbthreads.php?ubb=showflat&Number=118172#Post118172

#211038 - 2016-01-22 05:37 PM Re: Periodic "ingroup" failures [Re: JNK]
JNK Offline
Fresh Scripter

Registered: 2006-04-11
Posts: 33
Loc: USA
I stumbled upon http://www.kixtart.org/forums/ubbthreads.php?ubb=showflat&Number=118172#Post118172 earlier ... stopped reading after I saw "NT4".
#211039 - 2016-01-22 10:39 PM Re: Periodic "ingroup" failures [Re: JNK]
Glenn Barnas Administrator Offline
KiX Supporter
*****

Registered: 2003-01-28
Posts: 4372
Loc: New Jersey
Hmm - what's wrong with "NT4"? :)

I worked with a client a while back who was having strange and intermittent issues related to authentication and connecting to the domain during the login process (represented as authorized drive mapping failures). I modified the login script to do a full environment dump when we could not reproduce their issue/symptoms. What we found was that when the domain connection failed, the corresponding environment values were undefined. In some cases the entire environment was blank! (yeah - open a command prompt, type SET, get no output - if you could do that during logon). The thing that made this tough to troubleshoot is that if you dumped the environment after the login process completed, it was fine - it was just incompletely defined during the initial login process.

I can't provide any actual resolution because only 4 of his 150+ systems experienced this, and after he re-loaded one and the problem went away, he simply re-imaged the other 3. The key was seeing that the environment was messed up during the logon process, but not after.

We actually moved away from InGroup for group authorization in our login script because of timing and performance issues in large (15-20K+ users) organizations and/or those with lots of security groups. Our script caches the list of groups the user is a member of when the script starts, eliminating lots of InGroup lookups.
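
The general idea of caching the group list once and answering every later membership check from memory can be sketched as follows. This is not the Universal Login Script's actual code, and it is written in Python rather than KiXtart; `GroupCache`, `fetch_groups`, `fake_fetch` and the group names are all hypothetical stand-ins.

```python
class GroupCache:
    """Snapshot of a user's group names, taken once at script start."""

    def __init__(self, fetch_groups):
        # fetch_groups is a hypothetical callable standing in for the single
        # expensive directory lookup; it is called exactly once, here.
        self._groups = frozenset(name.casefold() for name in fetch_groups())

    def in_group(self, *names):
        """True if the user is in any of the named groups (case-insensitive)."""
        return any(name.casefold() in self._groups for name in names)


# Stand-in for the real lookup, counting how often it is called.
calls = []


def fake_fetch():
    calls.append(1)
    return ["Domain Users", "Sales-East", "VPN Users"]


cache = GroupCache(fake_fetch)
```

After the snapshot is taken, `cache.in_group("domain users")` and similar checks are plain set lookups, so the number of directory queries no longer grows with the number of checks in the script. The sketch ignores domain-qualified names such as DOMAIN\group.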

You can find more about our Universal Login Script from the User Guide or the product page.

Glenn
_________________________
Actually I am a Rocket Scientist! :D

#211048 - 2016-01-27 03:08 PM Re: Periodic "ingroup" failures [Re: Glenn Barnas]
JNK Offline
Fresh Scripter

Registered: 2006-04-11
Posts: 33
Loc: USA
Yep, I have an "audit.kix" script that runs when the error happens to dump the network, environment and a number of other bits out to a text file. That text file is then emailed directly to me. The Environment is fine. The output of "net user %username% /domain" also shows the user is a member of the group referenced in the "ingroup" function. In other words, on the surface, everything looks as it should be. Again, the problem appears to be with the truncated token cache in the registry.

My solution, at this point, is to remotely hack the user's registry to clear out the existing token cache and then reach out to my local techs to have the user run the Logon Script at their convenience.

reg query \\computer_name\hku
- this shows me a list of loaded user hives, keyed by SID; I look for one ending in _classes and copy it (sans _classes)

e.g., HKEY_USERS\S-1-5-80-1234567891-2345678912-3456789123-4567891234-5678912345

reg delete \\computer_name\hku\S-1-5-80-1234567891-2345678912-3456789123-4567891234-5678912345\software\kixtart\tokencache /va /f
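
The two manual steps above (pick the key ending in _Classes, strip the suffix, build the delete command) can be sketched in Python. This is only an illustration of the selection logic: it builds the command without running it, the function names are hypothetical, and the sample listing is made up around the example SID shown above.

```python
def logged_on_sids(reg_query_output):
    """Return the SIDs whose <SID>_Classes key appears in `reg query ...\\hku` output."""
    suffix = "_classes"
    sids = []
    for line in reg_query_output.splitlines():
        key = line.strip()
        if key.lower().endswith(suffix):
            # Keep only the last path component, minus the _Classes suffix.
            sids.append(key.rsplit("\\", 1)[-1][:-len(suffix)])
    return sids


def delete_command(computer, sid):
    """Build (but do not run) the reg delete command for one user's token cache."""
    target = "\\\\" + "\\".join(
        [computer, "hku", sid, "software", "kixtart", "tokencache"]
    )
    return ["reg", "delete", target, "/va", "/f"]


# Hypothetical listing, using the example SID from the post.
SID = "S-1-5-80-1234567891-2345678912-3456789123-4567891234-5678912345"
sample = "\n".join([
    "",
    r"HKEY_USERS\.DEFAULT",
    r"HKEY_USERS\S-1-5-18",
    "HKEY_USERS\\" + SID,
    "HKEY_USERS\\" + SID + "_Classes",
    "",
])
```

With more than one user hive loaded, `logged_on_sids` returns several SIDs, so matching the right one to the affected user remains a manual step.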

In all but one case after running this "fix", the problem has not recurred. The burning questions remain, though ...

1) What caused the problem in the first place?
2) Why doesn't the problem affect everyone? Only a small percentage (less than 1%) of my users are throwing this error.

#211054 - 2016-02-01 04:20 PM Re: Periodic "ingroup" failures [Re: JNK]
JNK Offline
Fresh Scripter

Registered: 2006-04-11
Posts: 33
Loc: USA
I wanted to provide a quick update on this issue.

After reviewing the results of the audit script, we have stumbled upon what we _think_ is a possible cause of the truncated Token Cache. It appears the users throwing this error are members of one or more AD groups containing references to Foreign Security Principals (FSPs). More specifically, the users throwing the error appear twice: once with their "real" migrated account and once with their FSP from the Trusted Domain.
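
One way to shortlist the groups to clean up is to scan exported member lists for FSP entries. The Python sketch below assumes the group memberships have already been exported as distinguished names by some other tool, and it relies on FSP objects living in the domain's ForeignSecurityPrincipals container. The function names, group names and DNs are hypothetical.

```python
FSP_CONTAINER = "cn=foreignsecurityprincipals,"


def is_fsp(member_dn):
    """True when the DN's parent is the ForeignSecurityPrincipals container."""
    # An FSP's own CN is a SID, so splitting on the first comma is safe for
    # the entries we want to catch; other DNs simply fail the prefix test.
    parts = member_dn.split(",", 1)
    return len(parts) == 2 and parts[1].strip().lower().startswith(FSP_CONTAINER)


def groups_with_fsp_members(groups):
    """Map each group name to its FSP member DNs, skipping groups with none."""
    flagged = {}
    for name, member_dns in groups.items():
        hits = [dn for dn in member_dns if is_fsp(dn)]
        if hits:
            flagged[name] = hits
    return flagged


# Hypothetical export: group name -> member distinguished names.
FSP_DN = (
    "CN=S-1-5-21-1111111111-2222222222-3333333333-1104,"
    "CN=ForeignSecurityPrincipals,DC=example,DC=com"
)
groups = {
    "Sales-East": ["CN=Jane Doe,OU=Users,DC=example,DC=com", FSP_DN],
    "VPN Users": ["CN=Jane Doe,OU=Users,DC=example,DC=com"],
}
```

This only finds groups that contain FSPs at all. Confirming that a given FSP duplicates a migrated account still means resolving its SID against the trusted domain.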

We are still working through the literally thousands of groups to remove these references, but early testing is showing a positive result.

... and of course, the person responsible for this mess is no longer with us to help clean up.

#bad_words

All indications point to the use of the Quest Tools to migrate accounts as the cause.

Also, cleaning up the group membership is step one. I still have to hack the registry and manually kill the Kixtart Token Cache. Running Kix with /F does not flush the cache on a machine that is broken.

#211055 - 2016-02-01 04:30 PM Re: Periodic "ingroup" failures [Re: JNK]
Allen Administrator Offline
KiX Supporter
*****

Registered: 2003-04-19
Posts: 4535
Loc: USA
Thank you for posting your findings. Please update us as you go.
#211239 - 2016-03-25 03:36 PM Re: Periodic "ingroup" failures [Re: Allen]
JNK Offline
Fresh Scripter

Registered: 2006-04-11
Posts: 33
Loc: USA
Quick update ... still no resolution for this despite best efforts. Thanks for all of the replies on this but at this point, I think we are just going to have to live with a certain percentage of our population throwing an error at logon. Once someone "important" gets the error and complains, it will get attention from the group that can actually dig into the problem at the bit level. Thanks!