AD clients NTuser.dat locked on login

By keith.abbott ·
Tags: Off Topic
Hi,

We're having problems with some of the users on our AD domain. They get userenv 1508 errors (among others) when they try to log in the first time in the morning.

Some background:
Our original domain is an NT domain. All users on the AD domain (except 1 or 2) were migrated from the NT domain to the AD domain using Microsoft's AD migration tool.

The AD domain is Win2k3 R2 (upgraded to R2); the clients are XP.

We were migrating users from the NT domain in a staged fashion and some had been working fine for a couple of months.

One Friday night I made some GPO changes to try to correct some time sync problems, and get the PDCE to sync with an external source.

The following Monday morning there was a rash of issues, with people getting warnings that their profiles couldn't be loaded and temp profiles were being created.

It would be an unbelievable coincidence if the two events were not related (I don't believe it).

Most were able to reboot and get loaded correctly but several profile rebuilds were required.

I backed out the changes I had made previously, but the problems persisted in the following days. A user could log in and out all day long with no problem, but overnight the problem would reappear.

At a suggestion from another site, I tried GPOfix to return the group policy to default (which was fine, because the default group policy had previously been modified and this gave us the chance to reset it and instead create our own adjunct policies). However, the problem persists.

About 150 of somewhat over 300 users have been migrated. Of the 150, I'd estimate about 50 have experienced the problem. Maybe 6 or 8 experience the problem very consistently. Others come and go. Some seldom have the problem; many have not yet experienced it. We had 11 users reboot yesterday, 16 the day before. The 1 or 2 that were built on the AD domain have not yet had a problem.

Using Sysinternals procmon and Process Explorer, we have determined that the cause of the problem (at least in the case of our test subject) is that System (PID 4) is locking the profile's NTuser.dat.

The problem is profile specific: we had our test user attempt to log in and, when it failed, log off and log in under a different account. That attempt was successful.

For those who have daily problems, rebuilding their profiles usually gives good results, but the fix is not always permanent.

We have tried installing User Profile Hive Cleaner. Again, this had some but not universal success (and I consider it a band-aid). It worked fine for some of our sample group, but the main test user had little or no improvement.

By logging in under another account and manually loading the user's profile, we have been able to determine that, in the case of the test user, the event causing the issue occurs between 20:00 and 20:45. The user logs out about 15:30 each day, but I don't think the event is related to the logout time (I could be wrong).
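
The window above came from repeated manual checks. A minimal Python sketch of how that narrowing could be automated, assuming a probe scheduled from another account on the same workstation. The function names are hypothetical, and the only test made is whether the file can be opened for read/write, which is an assumption about how the lock shows up rather than a substitute for procmon:

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple


def is_locked(path: str) -> bool:
    """Return True if the file cannot be opened for read/write.

    Assumption: a hive held open by another process surfaces as
    PermissionError when a second process tries to open it.
    """
    try:
        with open(path, "r+b"):
            return False
    except PermissionError:
        return True


def lock_window(
    samples: Iterable[Tuple[datetime, bool]],
) -> Optional[Tuple[Optional[datetime], datetime]]:
    """Given (timestamp, locked) samples in time order, return the last
    time the file was seen free and the first time it was seen locked.
    Returns None if the file was never seen locked."""
    last_free = None
    for when, locked in samples:
        if locked:
            return last_free, when
        last_free = when
    return None
```

Calling `is_locked` on the test user's ntuser.dat every few minutes and passing the results to `lock_window` would give the two timestamps that bracket the event; a shorter polling interval gives a tighter bracket.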

I think those are the main facts (there are certainly a hundred others).

Does anyone have any idea how we might approach finding a solution to these problems?

Thanks for your help.
k

This conversation is currently closed to new comments.

27 total posts (Page 2 of 3)   Prev   01 | 02 | 03   Next
All Answers

standard set of userenv errors

by keith.abbott In reply to event viewer

Actually as I look, a few additional errors have started appearing at some point. In chronological order, the following application errors occur (Note the user logs in around 0630):
0033 userenv 1054
Windows cannot obtain the domain controller name for your computer network. (A socket operation was attempted to an unreachable host. ). Group Policy processing aborted.

0226 userenv 1054
0628 userenv 1508
Windows was unable to load the registry. This is often caused by insufficient memory or insufficient security rights.

DETAIL - The process cannot access the file because it is being used by another process. for C:\Documents and Settings\T00954\ntuser.dat


2 sec later userenv 1502
Windows cannot load the locally stored profile. Possible causes of this error include insufficient security rights or a corrupt local profile. If this problem persists, contact your network administrator.

DETAIL - The process cannot access the file because it is being used by another process.


<1sec later userenv 1515
Windows has backed up this user's profile. Windows will automatically try to use the backed up profile the next time this user logs on.


<1sec later userenv 1511

Windows cannot find the local profile and is logging you on with a temporary profile. Changes you make to this profile will be lost when you log off.
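
The 1508, 1502, 1515, 1511 run above is the sequence that ends in a temporary profile. A small Python sketch for spotting that run in a list of event IDs, assuming the IDs have already been exported from the Application log in chronological order (the export step and the function name are hypothetical; the ID summaries are paraphrased from the messages above):

```python
# Event IDs and short summaries, paraphrased from the log messages above.
USERENV_EVENTS = {
    1054: "cannot obtain domain controller name; Group Policy processing aborted",
    1508: "unable to load the registry (ntuser.dat in use by another process)",
    1502: "cannot load the locally stored profile",
    1515: "profile backed up",
    1511: "logged on with a temporary profile",
}

# The run of events that ends in a temporary profile, in log order.
TEMP_PROFILE_CASCADE = (1508, 1502, 1515, 1511)


def find_cascades(event_ids):
    """Return the start indexes where the full cascade appears consecutively."""
    n = len(TEMP_PROFILE_CASCADE)
    return [
        i
        for i in range(len(event_ids) - n + 1)
        if tuple(event_ids[i:i + n]) == TEMP_PROFILE_CASCADE
    ]
```

Running this over each affected workstation's log would show whether the 1054 errors always precede the cascade or only sometimes do.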


The 1054's are 'new'. Please note that I have tested the following.

Flush the user's dns cache
ping PDCE - resolves fine
Ping DC (secondary DNS) - resolves fine
ping PDCE.Domain.com - resolves fine
ping DC.Domain.com - resolves fine
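
The same name checks can be scripted so they are repeatable across several workstations. A minimal Python sketch, assuming the placeholder names used above; `check_names` is a hypothetical helper, and `socket.gethostbyname` goes through the operating system's resolver, which is close to, but not guaranteed identical to, what ping does:

```python
import socket


def check_names(names, resolve=socket.gethostbyname):
    """Map each name to its resolved address, or None if the lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


if __name__ == "__main__":
    # Placeholder names from the list above.
    for name, address in check_names(
        ["PDCE", "DC", "PDCE.Domain.com", "DC.Domain.com"]
    ).items():
        print(name, "->", address or "FAILED")
```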

OK I've just located an issue with DNS resolution - but since it only involves pinging the old NT domain servers, it may be completely unrelated. I'm afraid I'll have to include some background GC has already heard in a different thread:

Our original NT domain had its own DNS at x.x.x.30. When we created the new AD domain, naturally the first DC had DNS; it was at x.x.x.48.

Later on, we created a new DNS server on the AD domain on a DC, decommissioned the DNS server on the NT domain, and moved the IP address of the old NT DNS server (x.x.x.30) to the DC. We bound its DNS to x.x.x.30 and disabled its old NIC, so x.x.x.30 became its only address.

So, as it stands today, all AD domain users point to x.x.x.48 as the primary DNS and x.x.x.30 as the secondary DNS. For NT domain users it's the reverse.

I set up a packet capture on our DNS servers and had the AD test user attempt to ping the DC from the NT domain. The suffix search order for an AD user is 1. ADdom.com, 2. NTdom.com, 3. RemoteSiteDOM.com.

The remote site points up to our corp offices; it is also where the x.x.x.48 DNS forwards for internet resolution.

Viewing the packet capture, I expected to see a request to resolve NTPDC.ADDOM.COM, which should fail, then NTPDC.NTDOM.COM, which should succeed. Instead, NTPDC.NTDOM.COM failed and the request was made for NTPDC.RemoteSiteDOM.com. It was correctly resolved by x.x.x.48 but with the wrong suffix in the return. I have no idea why.
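
The lookup order expected here can be written down as a small model to compare against the capture. This is a simplified Python sketch, not the actual Windows resolver: it only appends each suffix in list order to a single-label name and stops at the first hit. The `lookup` callable and the function names are hypothetical stand-ins for a real DNS query:

```python
def candidate_fqdns(name, suffixes):
    """Names tried, in order, for a given suffix search list.

    Simplification: a name that already contains a dot is tried as-is only.
    """
    if "." in name:
        return [name]
    return [f"{name}.{suffix}" for suffix in suffixes]


def resolve_with_suffixes(name, suffixes, lookup):
    """Try each candidate in order; return (fqdn, address) for the first hit,
    or None if no candidate resolves. `lookup` returns an address or None."""
    for fqdn in candidate_fqdns(name, suffixes):
        address = lookup(fqdn)
        if address is not None:
            return fqdn, address
    return None
```

With the search list above, `candidate_fqdns("NTPDC", ["ADdom.com", "NTdom.com", "RemoteSiteDOM.com"])` yields the three names in the order they should appear in the capture, so a query for the third suffix implies the second lookup returned a failure.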

Further, it doesn't seem like this should come into play at all with the issue at hand (profile locked because ntuser.dat is in use) so it may just be a side issue I need to investigate.

PS, as a side note, I repeated the test of having the test user ping the old NT PDC, only this time via FQDN (ntpdc.ntdom.com), and x.x.x.48 correctly resolved the name. I don't understand why it didn't when it was an appended suffix!

DNS suffix list not supported???

by keith.abbott In reply to standard set of userenv e ...

This doesn't make sense. I just found the following statement on support.microsoft.com:

The following methods of distribution are not available for pushing the domain suffix search list to DNS clients:
Dynamic Host Configuration Protocol (DHCP). You cannot configure DHCP to send out a domain suffix search list. This is currently not supported by the Microsoft DHCP server.


This is listed for NT4/Win2k. Our DHCP server is Win2k Server. Is this correct?

yes it's true in that you can't "push" a search list

by CG IT In reply to DNS suffix list not suppo ...

which is to say, override a static list or a list from options configured in DHCP. Clients that use DHCP get their configuration (e.g. DNS servers) from options configured in DHCP. You can't specify a list to use other than those you list in the options.

The list in options is what the client will use, in descending order. If the first DNS server is unavailable it will use the second one. If that isn't available, then the third one. If the workstation cannot find a DNS server that can answer the query, which in this case is to find a domain controller for the domain to authenticate with, you get the error that it cannot contact the DC to authenticate with. If the user has a local machine profile that is the same account as the domain profile, then that loads, but Group Policy from the domain is aborted and the local machine group policy, if any, is loaded.

ok, Microsoft seemed to say something else

by keith.abbott In reply to yes it's true in that you ...

"you can't specify a list to use other than those you list in the options"

OK, Microsoft seemed to be saying that you could not provide a list of DNS suffix search order to clients via DHCP (option 119). But you say they really just mean that you can't OVERRIDE the local list?

That would be fine. From their statement, it just looked like you couldn't provide the suffix list at all via DHCP. In that case, what would be the point of even including option 119?
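
For reference, the DHCP option for a domain suffix search list is defined in RFC 3397 as option code 119, with each name carried in DNS wire format (length-prefixed labels ending in a zero byte). A minimal Python sketch of that encoding, assuming no name compression and a list short enough to fit in one option; `encode_domain_search` is a hypothetical helper, and whether a particular DHCP server or client honors the option is a separate question this does not answer:

```python
def encode_domain_search(suffixes):
    """Build the raw bytes (code, length, payload) of a DHCP option 119."""
    payload = bytearray()
    for suffix in suffixes:
        for label in suffix.rstrip(".").split("."):
            raw = label.encode("ascii")
            if not 0 < len(raw) <= 63:
                raise ValueError(f"bad label in {suffix!r}")
            payload.append(len(raw))  # one length byte per label
            payload += raw
        payload.append(0)  # zero-length root label ends each name
    if len(payload) > 255:
        raise ValueError("list too long for a single option")
    return bytes([119, len(payload)]) + bytes(payload)
```

Encoding the search list from this thread this way gives the byte string a packet capture of the DHCP offer would need to contain if the server were actually sending the option.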

Also as I was going through the UserENV logs for our test user, I noticed several

"No GPO changes but couldn't read extension (xyz)'s status or policy time."

They didn't appear to be in any place significant, but I'm not sure if their presence is significant or not.

k

personal opinion, you really don't want to do this

by CG IT In reply to yes it's true in that you ...

using option 119.

I've not seen any large company doing this, and by changing the object names via a DHCP option that appends suffixes, I think you're just asking for trouble.

What option 119 [and honestly I had to re-read the RFC again] does is change the computer name

example: NetBIOS name. Active Directory Domain name is changed.

For Group Policy [and I had to look this up] from Technet:

" By default, the primary DNS suffix portion of a computer's FQDN is the same as the name of the Active Directory domain to which the computer is joined. To allow different primary DNS suffixes, a domain administrator can create a restricted list of allowed suffixes by modifying the msDS-AllowedDNSSuffixes attribute in the domain object container. This attribute is managed by the domain administrator using Active Directory Service Interfaces (ADSI) or the Lightweight Directory Access Protocol (LDAP)".


I keep adding comments:

I just called up a colleague [works at HP] and asked him if he ever did this, and he said they used to do this when they wanted to quickly move a large batch of computers to a different domain during weekend moves. What he said was, "never worked right well as you know".

what is a better way?

by keith.abbott In reply to yes it's true in that you ...

OK, so how do you handle it, other than with suffix search order, when you have applications that refer only to the computer name but are on domains different from the calling computer (e.g. an application on computer 'itsme.mydomain.com' refers to a datasource at 'thatserver.hisdomain.com' only as 'thatserver')?

We obviously have this situation in the extreme since we currently have 2 domains sharing the same address space (until we complete the conversion - which we can't complete until we resolve the profile issue). Even then, we will still have the issue to a lesser degree.

thx,
k

Better way - if the application is a custom one eg in house,

by CG IT In reply to yes it's true in that you ...

reference your info: "applications that only refer to the computer name but are on domains different from the calling computer (eg application on computer 'itsme.mydomain.com' refers to datasource at 'thatserver.hisdomain.com' only as 'thatserver'), other than suffix search order?"

Without knowing the application and how it was created [custom/standardized/open source]
and why it cannot work with a new data source, it's tough to suggest a better way.

Moving the data source to a different domain shouldn't really be a problem for an application, as long as the application is configured to use the new data source. By that, I mean the application knows that the data source is at a particular path/location. Again, whether that path is coded in, specified during setup, or changed afterwards... eh, only the guys who made it can tell you that...

Keeping permissions and rights might be a big problem when you move the data source, as the rights and permissions configured on the data source would be for a different domain.

Tough to say...

No Control over the guilty apps

by keith.abbott In reply to yes it's true in that you ...

They reside in someone else's ballpark and they won't modify them.

And there is still the issue of the 2 domains. Does \\myNTserver\thisshare refer to \\myNTserver.NTdom.com\thisshare or \\myNTserver.ADdom.com\thisshare?

So I am not sure we can get away from this scenario in the short term. But if it is more reliable to handle it on the host than via DHCP or group policy, or more reliable to handle it some other way, we want to do that.

k

3 days clean

by keith.abbott In reply to users' SID is your proble ...

So far our test user has gone 3 days without a reboot (although the first one doesn't really count, since I was on there mucking around the night before).

We'll see where it goes. If we can get the clients clean, we can continue with the migration.

But that raises another question. Should we skip migration altogether, remove the machine from the NT domain, manually add it to the AD domain, have the user log in, then copy the stuff from their old profile to their new one?

It's not ideal, because a lot of user settings don't come over with that method (but it's better than these problems).

What do you think?

k

so what method did you use to get your "clean client"

by CG IT In reply to 3 days clean

unjoin NT, rejoin AD, redo new AD profile with old NT profile?
