Help with PEAP, NPS and Cisco AP please!?

mattsmith2006
Hi,

I've configured a PEAP, NPS and Cisco AP environment at head office on a Server 2008 machine and it's working as expected. But when I deployed this solution to other sites, it didn't work!
The differences between the sites are as follows:
1. The NPS servers at the other sites run Server 2008 R2; the head office NPS server runs Server 2008.
2. The head office certificate server is on the same box as the NPS server. The remote site NPS servers receive their self-signed certificate from this head office Certificate server.
3. The clients in the head office have a copy of the NPS server certificate in Trusted Root Certificate Authorities. Other site clients do not have a copy of their NPS server certificate.

What works:
I am running two SSIDs per AP: one performs PEAP authentication (this part doesn't work); the other performs WPA2 password authentication (this part works). Since the WPA2 password part works, I assume the AP is registered correctly with the NPS server, and the AP log indicates this is OK.
The Cisco AP config is identical across all sites.

The symptoms are:
Windows 7
Client: "windows was unable to connect to this network" No event logs
Server: No event logs and the same NPS log entry as below.

Windows XP
Client: Wireless status "Attempting to Authenticate". Stops, then repeats forever... No relevant event logs.
NPS Server: No event logs. NPS log as follows:

<Event><Timestamp data_type="4">08/31/2011 14:09:20.220</Timestamp>
<Computer-Name data_type="1">SERVER_NAME</Computer-Name>
<Event-Source data_type="1">IAS</Event-Source>
<Framed-MTU data_type="0">1400</Framed-MTU>
<Called-Station-Id data_type="1">e804.625e.3250</Called-Station-Id>
<Calling-Station-Id data_type="1">001c.bf87.1ab6</Calling-Station-Id>
<Service-Type data_type="0">1</Service-Type>
<NAS-Port-Type data_type="0">19</NAS-Port-Type>
<NAS-Port data_type="0">438</NAS-Port>
<NAS-Port-Id data_type="1">438</NAS-Port-Id>
<NAS-IP-Address data_type="3">10.232.240.140</NAS-IP-Address>
<NAS-Identifier data_type="1">DENI-AP01</NAS-Identifier>
<Client-IP-Address data_type="3">10.232.240.140</Client-IP-Address>
<Client-Vendor data_type="0">9</Client-Vendor>
<Client-Friendly-Name data_type="1">DENI-AP01</Client-Friendly-Name>
<User-Name data_type="1">USERNAME</User-Name>
<Proxy-Policy-Name data_type="1">Deni Secure Wireless Connections</Proxy-Policy-Name>
<Provider-Type data_type="0">1</Provider-Type>
<SAM-Account-Name data_type="1">USERNAME</SAM-Account-Name>
<Class data_type="1">311 1 10.232.240.48 08/30/2011 05:21:16 954</Class>
<Authentication-Type data_type="0">5</Authentication-Type>
<NP-Policy-Name data_type="1">Deni Wireless New</NP-Policy-Name>
<Fully-Qualifed-User-Name data_type="1">USERNAME</Fully-Qualifed-User-Name>
<Quarantine-Update-Non-Compliant data_type="0">1</Quarantine-Update-Non-Compliant>
<Packet-Type data_type="0">1</Packet-Type>
<Reason-Code data_type="0">0</Reason-Code></Event>

<Event><Timestamp data_type="4">08/31/2011 14:09:20.220</Timestamp>
<Computer-Name data_type="1">SERVER_NAME</Computer-Name>
<Event-Source data_type="1">IAS</Event-Source>
<Class data_type="1">311 1 10.232.240.48 08/30/2011 05:21:16 954</Class>
<Session-Timeout data_type="0">30</Session-Timeout>
<Fully-Qualifed-User-Name data_type="1">USERNAME</Fully-Qualifed-User-Name>
<Client-IP-Address data_type="3">10.232.240.140</Client-IP-Address>
<Client-Vendor data_type="0">9</Client-Vendor>
<Client-Friendly-Name data_type="1">DENI-AP01</Client-Friendly-Name>
<Proxy-Policy-Name data_type="1">Deni Secure Wireless Connections</Proxy-Policy-Name>
<Provider-Type data_type="0">1</Provider-Type>
<SAM-Account-Name data_type="1">USERNAME</SAM-Account-Name>
<Quarantine-Update-Non-Compliant data_type="0">1</Quarantine-Update-Non-Compliant>
<Authentication-Type data_type="0">5</Authentication-Type>
<NP-Policy-Name data_type="1">Deni Wireless New</NP-Policy-Name>
<Packet-Type data_type="0">11</Packet-Type>
<Reason-Code data_type="0">0</Reason-Code></Event>
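
For reference when reading the two events above: Packet-Type is the RADIUS packet code from RFC 2865, so 1 is an Access-Request and 11 is an Access-Challenge. The log shows a request answered by a challenge, with no Access-Accept (2) or Access-Reject (3), which would be consistent with the exchange stalling partway rather than being refused. A minimal Python sketch for decoding such events follows; the function name `summarize_event` and the abbreviated sample strings are illustrative, not part of NPS.

```python
import xml.etree.ElementTree as ET

# RADIUS packet codes from RFC 2865 (only the ones relevant to authentication).
PACKET_TYPES = {
    1: "Access-Request",
    2: "Access-Accept",
    3: "Access-Reject",
    11: "Access-Challenge",
}


def summarize_event(event_xml):
    """Return the packet type, reason code and client name of one <Event>."""
    root = ET.fromstring(event_xml)
    fields = {child.tag: (child.text or "").strip() for child in root}
    code = int(fields["Packet-Type"])
    return {
        "packet_type": PACKET_TYPES.get(code, "unknown ({})".format(code)),
        "reason_code": int(fields["Reason-Code"]),
        "client": fields.get("Client-Friendly-Name", ""),
    }


# Abbreviated copies of the two logged events, written as well-formed XML.
REQUEST = (
    '<Event><Timestamp data_type="4">08/31/2011 14:09:20.220</Timestamp>'
    '<Client-Friendly-Name data_type="1">DENI-AP01</Client-Friendly-Name>'
    '<Packet-Type data_type="0">1</Packet-Type>'
    '<Reason-Code data_type="0">0</Reason-Code></Event>'
)
CHALLENGE = REQUEST.replace(">1</Packet-Type>", ">11</Packet-Type>")

for event in (REQUEST, CHALLENGE):
    print(summarize_event(event))
```

The parser expects each event to be well-formed XML, so a log copied with missing or unbalanced tags would need repairing first.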

Clarification?
I assume the problem is with the NPS configuration (which was configured with the wizard).
Connection Request Policy - this seems OK? There's not much in here: NAS Port Type = Wireless Other or Wireless 802.11
Network Policies -
Windows Groups: Wireless Users or Wireless Computers
Constraints: Authentication = PEAP: Includes self signed cert issued to NPS server. EAP type = EAP-MSCHAP v2
Everything else is default.

I think the issue may be the certificate? My (limited) understanding is that only the server needs to have a copy of the certificate (group policy for the client tells the client NOT to validate the certificate as it is self-signed). Is this correct?

I've tried setting this up at 3 sites. The first site was OK, the second site was OK. Then after a day or two (and probably a server reboot) both sites stopped working. I set it up on the 3rd site and can't get it working at all.

Any thoughts or insights would be HUGELY appreciated!!

Thanks!
  • robo_dev

    Windows AD authentication uses Kerberos, which, according to the RFC, uses UDP. UDP, being a connectionless protocol, cannot deal with out-of-order packets, which can be the result of either different MTU sizes or general latency/jitter issues over WAN connections.

    The fix may be to reconfigure the workstations to use TCP for Kerberos (a simple registry change).
    http://blog.besida.net/2009/09/how-to-force-kerberos-to-use-tcp.html

    You've probably done this, but this sort of thing can happen if the APs are set to Shared Authentication and the clients are set to 'Open'.

    And also, there are two settings that mess up MANY WLAN workstations:
    a) broadcasting the SSID
    b) using DHCP

    So as a test, try a static address while broadcasting the SSID.

    Note that Cisco AP debug commands are very powerful. You can watch the authentication process in real time, which may give some clues.

    mattsmith2006

    Thanks for the reply.

    I've had a look at the Cisco AP debug logs and it all looks good until I get to this:


    Sep 1 04:59:00.998: RADIUS: no sg in radius-timers: ctx 0x11D0E08 sg 0x0000
    Sep 1 04:59:00.998: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
    Sep 1 04:59:05.542: RADIUS: no sg in radius-timers: ctx 0x11D0E08 sg 0x0000
    Sep 1 04:59:05.542: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
    Sep 1 04:59:09.862: RADIUS: no sg in radius-timers: ctx 0x11D0E08 sg 0x0000
    Sep 1 04:59:09.862: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
    Sep 1 04:59:14.246: RADIUS: no sg in radius-timers: ctx 0x11D0E08 sg 0x0000
    Sep 1 04:59:14.246: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
    Sep 1 04:59:19.078: RADIUS: no sg in radius-timers: ctx 0x11D0E08 sg 0x0000
    Sep 1 04:59:19.078: RADIUS: Fail-over denied to (10.232.240.48:1812,1813) for id 1645/123
    Sep 1 04:59:19.078: RADIUS: No response from (10.232.240.48:1812,1813) for id 1645/123
    Sep 1 04:59:19.078: RADIUS/DECODE: No response from radius-server; parse response; FAIL
    Sep 1 04:59:19.078: RADIUS/DECODE: Case error(no response/ bad packet/ op decode);parse response; FAIL
    Sep 1 04:59:19.078: dot11_auth_dot1x_parse_aaa_resp: Received server response:FAILOVER_RETRY
    Sep 1 04:59:19.078: dot11_auth_dot1x_parse_aaa_resp: found eap pak in server response
    Sep 1 04:59:19.078: Client 001c.bf87.1ab6 failed: EAP reason 0
    Sep 1 04:59:19.078: dot11_auth_dot1x_parse_aaa_resp: Failed client 001c.bf87.1ab6 with aaa_req_status_detail 0
    Sep 1 04:59:19.078: dot11_auth_dot1x_run_rfsm: Executing Action(SERVER_WAIT,SERVER_FAIL) for 001c.bf87.1ab6
    Sep 1 04:59:19.079: dot11_auth_dot1x_send_response_to_client: Forwarding server message to client 001c.bf87.1ab6
    Sep 1 04:59:19.079: dot11_auth_dot1x_send_response_to_client: Started timer client_timeout 30 seconds
    Sep 1 04:59:19.079: dot11_auth_dot1x_send_client_fail: Authentication failed for 001c.bf87.1ab6
    Sep 1 04:59:19.079: %DOT11-7-AUTH_FAILED: Station 001c.bf87.1ab6 Authentication failed
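
The retransmit pattern in the debug output above can be summarised with a short script. This is a minimal Python sketch, assuming the line format shown in this thread; `summarize_radius_timeouts` and the regular expression are illustrative helpers, not a Cisco tool. It counts how often the AP retransmitted the same request, the gap between attempts, and whether the AP finally logged "No response".

```python
import re

# Matches lines such as:
#   Sep 1 04:59:05.542: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
RADIUS_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}\.\d{3}): RADIUS: "
    r"(Retransmit to|No response from) \(([\d.]+):(\d+),(\d+)\)"
)


def summarize_radius_timeouts(log_text):
    """Count retransmits to the RADIUS server and report whether the AP gave up."""
    retransmit_times = []
    gave_up = False
    server = None
    for line in log_text.splitlines():
        match = RADIUS_RE.search(line)
        if match is None:
            continue
        hours, minutes, seconds, event, ip, auth_port, _acct_port = match.groups()
        timestamp = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
        server = "{}:{}".format(ip, auth_port)
        if event == "Retransmit to":
            retransmit_times.append(timestamp)
        else:
            gave_up = True
    gaps = [round(b - a, 3) for a, b in zip(retransmit_times, retransmit_times[1:])]
    return {
        "server": server,
        "retransmits": len(retransmit_times),
        "gaps_seconds": gaps,
        "gave_up": gave_up,
    }


SAMPLE_LOG = """\
Sep 1 04:59:00.998: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
Sep 1 04:59:05.542: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
Sep 1 04:59:09.862: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
Sep 1 04:59:14.246: RADIUS: Retransmit to (10.232.240.48:1812,1813) for id 1645/123
Sep 1 04:59:19.078: RADIUS: Fail-over denied to (10.232.240.48:1812,1813) for id 1645/123
Sep 1 04:59:19.078: RADIUS: No response from (10.232.240.48:1812,1813) for id 1645/123
Sep 1 04:59:19.078: RADIUS/DECODE: No response from radius-server; parse response; FAIL
"""

print(summarize_radius_timeouts(SAMPLE_LOG))
```

Repeated retransmits of one request followed by "No response" indicate that the AP never accepted a reply to that particular packet, whatever the server logged on its side.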

    NOTE:
    Sep 1 04:59:19.078: RADIUS/DECODE: No response from radius-server; parse response; FAIL

    On other forums ppl have suggested this is a RADIUS password issue. But I get
    Client 001c.bf87.1ab6 failed: EAP reason 0

    If I change the password on the NPS to make them different I get:
    Client 001c.bf87.1ab6 failed: EAP reason 4

    So it's not a password error...

    Microsoft Network Monitor gives me this

    216 2:58:56 PM 1/09/2011 41.6273673 CISCO_AP SERVER_NAME EAP EAP:Response, Type = Identity {EAP:36, RADIUS:35, UDP:34, IPv4:33}
    219 2:58:56 PM 1/09/2011 41.6754517 SERVER_NAME CISCO_AP EAP EAP:Request, Type = PEAP,PEAP start {EAP:36, RADIUS:35, UDP:34, IPv4:33}
    220 2:58:56 PM 1/09/2011 41.6851215 CISCO_AP SERVER_NAME TLS TLS:TLS Rec Layer-1 HandShake: Client Hello. {TLS:40, SSLVersionSelector:39, EAP:38, RADIUS:37, UDP:34, IPv4:33}
    221 2:58:56 PM 1/09/2011 41.6855813 SERVER_NAME CISCO_AP TLS TLS:TLS Rec Layer-1 HandShake: Server Hello. Certificate. {TLS:40, SSLVersionSelector:39, EAP:38, RADIUS:37, UDP:34, IPv4:33}

    ...repeats...

    So it appears to me to be an issue between the AP and the RADIUS server. The RADIUS server uses a certificate issued from our Server 2008 CA.

    Aligning the timestamps on the packets, it looks like this packet isn't received/accepted by the AP.
    TLS:TLS Rec Layer-1 HandShake: Server Hello. Certificate. {TLS:40, SSLVersionSelector:39, EAP:38, RADIUS:37, UDP:34, IPv4:33}

    Any thoughts on what this might mean?

    Thanks!

    mattsmith2006

    Oh yeah - I did try your other suggestions to no avail...

    robo_dev

    Triple check the port numbers here.....

    The port values of 1812 for authentication and 1813 for accounting are RADIUS standard ports defined by the Internet Engineering Task Force (IETF) in RFCs 2865 and 2866. However, by default, many access servers use ports 1645 for authentication requests and 1646 for accounting requests. No matter which port numbers you decide to use, make sure that NPS and your access server are configured to use the same ones.

    In this case are you using APs which ALSO have internal RADIUS?

    RADIUS uses UDP for authentication... UDP cannot tolerate issues with MTU differences or out-of-order packets. And I assume there is no firewall or VLAN ACL in between these two devices?

    mattsmith2006

    I tried both the 1645/1646 and 1812/1813 combos (changing the ports on both the APs and the RADIUS server) in case that made a difference, which it didn't.

    "UDP cannot tolerate issues with MTU differences or out-of-order packets."
    I'm not sure how to test for this? I do notice MTU = 1400 in the Cisco debug.
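
One rough way to reason about the MTU question is to estimate how large a RADIUS reply becomes when it carries a full-size EAP fragment. The sketch below is a back-of-envelope Python calculation, not a diagnosis: it assumes the server sizes its EAP fragments up to the Framed-MTU value (1400 here), uses the fixed header sizes from RFC 2865 and RFC 3579, and treats `other_attr_bytes` as a hypothetical placeholder for whatever else the server adds (State, Class, Session-Timeout and so on), which this thread does not measure.

```python
import math

IP_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8
RADIUS_HEADER = 20    # code, identifier, length, 16-byte authenticator
ATTR_HEADER = 2       # type and length bytes on every RADIUS attribute
MAX_ATTR_VALUE = 253  # largest value a single RADIUS attribute can carry


def radius_datagram_size(eap_bytes, other_attr_bytes=0):
    """Estimate the IP datagram size of a RADIUS packet carrying an EAP fragment."""
    # The EAP payload is split across as many EAP-Message attributes as needed.
    eap_attrs = math.ceil(eap_bytes / MAX_ATTR_VALUE)
    eap_total = eap_bytes + eap_attrs * ATTR_HEADER
    # RFC 3579 requires a Message-Authenticator alongside EAP-Message.
    message_authenticator = ATTR_HEADER + 16
    return (IP_HEADER + UDP_HEADER + RADIUS_HEADER
            + eap_total + message_authenticator + other_attr_bytes)


def would_fragment(eap_bytes, link_mtu=1500, other_attr_bytes=0):
    """True if the estimated datagram exceeds the link MTU and needs IP fragmentation."""
    return radius_datagram_size(eap_bytes, other_attr_bytes) > link_mtu


# 1400 is the Framed-MTU from the NPS log; 40 extra bytes is an arbitrary guess.
print(radius_datagram_size(1400))
print(would_fragment(1400))
print(would_fragment(1400, other_attr_bytes=40))
```

If the estimate sits close to or above the link MTU, the large Server Hello / Certificate reply may be getting IP-fragmented, and a packet capture on the AP side would show whether all fragments arrive.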

    The RADIUS server and AP are connected to the same switch (tried Cisco, HP and desktop D-Link switches, but no difference). No VLANs.

    How would I put Wireshark on a Cisco AP interface? A computer with 2 NICs sitting in between the AP and switch?

    robo_dev

    Plug a hub in between the AP and switch, and plug your Wireshark box into that.

    Technically you can do port-mirroring from one switch port to a second switch port on most managed switches, but the lazy/simple approach is to use a hub. Two NICs won't help because a switched port does not see the directed traffic on other ports.

    Wireshark should show you both the packets sent and packets received, therefore you should be able to see what's what.

    Most connections should be OK with 1400 as MTU.

    A simple thing to check for is to make sure the Ethernet port to the AP is not mis-negotiating duplex settings. It's not unusual for the switch port to be set to auto, but the device (also set to auto) connects at half-duplex, while the switch goes full-duplex. This causes CRC errors and packet loss that you see in the stats of the LAN switch. It could also mess up your RADIUS authentication.

    NetMan1958

    In your OP you stated:
    "3. The clients in the head office have a copy of the NPS server certificate in Trusted Root Certificate Authorities. Other site clients do not have a copy of their NPS server certificate."
    Have you tried installing the root CA cert on a client at one of the sites?
