radek.rous

Network Agent stability on clients [In progress]


Hi, I have a fairly big problem with Network Agents and their stability. In the last 2 months we deployed over 3000 machines, and now more than 300 machines are reported as "Not connected for a long time". I assume some of those reports are correct, but at least 70% are valid machines.

The problem is that those machines are running KES 10 MR1 and either their Network Agents have corrupted files and do not work, or the agent works but does not report correct information about KES (such as protection status and definition database) and needs reinstallation.

An additional issue is that we have active and roaming policies for agents. The roaming policy enables the firewall feature, which causes a problem: if the NA is broken, the computer does not respond to network discovery, is not remotely reachable, and cannot be fixed remotely.

 

Is there any way to harden the stability of the agent?

Without valid information in KSC I am not able to manage the machines correctly.

It would also be interesting to include the agent directly in KES. Are there any plans to do that?

We are running the latest available KSC 10 agents with patch C (the patch was installed after the agent installation).


Please check the agent's settings. Do you use an FQDN or an IP address?


One thing you can do is check the Recovery options of the Network Agent service. By default it is set to "restart service after 0 minutes", which frequently leads to a situation where a crashing agent starts writing its dump, is restarted while still dumping, tries to dump again... and eventually stays down permanently (until reinstalled). Try changing the recovery interval to 2-3 minutes.
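For anyone who prefers to script this rather than click through the service properties, here is a minimal sketch that builds the Windows `sc.exe failure` command for a 3-minute restart delay. The service name `klnagent` is taken from the name used later in this thread; the 86400-second reset period and the function name are my own assumptions, so adjust them to your environment. The command has to be run from an elevated prompt.

```python
import subprocess


def build_recovery_command(service="klnagent", delay_minutes=3, reset_seconds=86400):
    """Build the sc.exe argument list that delays service restarts after a crash.

    service       -- service name (assumed to be "klnagent" for the Network Agent)
    delay_minutes -- wait before each restart; the thread suggests 2-3 minutes
    reset_seconds -- failure counter reset period (assumed value: one day)
    """
    if delay_minutes < 0:
        raise ValueError("delay_minutes must not be negative")
    delay_ms = int(delay_minutes * 60 * 1000)
    # Same delayed restart for the first, second and subsequent failures.
    actions = "/".join(["restart", str(delay_ms)] * 3)
    # sc.exe expects the value as a separate token after "reset=" / "actions=".
    return ["sc.exe", "failure", service,
            "reset=", str(reset_seconds),
            "actions=", actions]


def apply_locally():
    """Run the command on the local machine (Windows, administrator rights)."""
    return subprocess.run(build_recovery_command(), check=False).returncode


if __name__ == "__main__":
    print(" ".join(build_recovery_command()))
```

Running the script only prints the command line; call `apply_locally()` on a test machine first if you want it applied.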


Hello.

First, please specify whether you tried to restart the Network Agent service.

Please also specify the installed Network Agent version, including the build.

Thank you.


Of course I did. The (re)start attempts (including right after a reboot) fail if the agent runs into this kind of problem, and they keep failing forever. It has happened to 5-10% of my installed base, and it still sometimes happens when I forget to change the recovery delay from 0 to 3 minutes after new installations (the Network Agent is not part of my image).

The version is anything up to 10.1.249 patch b (including earlier 9.* versions). For 10.1.249 patch c (running on my machine right now) I cannot say whether the problem still exists: on most installations I adjusted the recovery timing before problems of this kind showed up, and patch c is relatively new.


Hello,

Could you first create a remote task to remove the agent, and then install the agent again?

We are waiting for your reply.

Thanks.


Em... sorry, but I don't have any problematic installation (of this kind) at hand right now. That's why I dared to recommend the approach to Inkognitos, after all.

But if I had one, I would be unable to run a remote agent removal task. Or rather, I would be able to create it but not to run it, because the agent would have crashed at startup, before it could remove itself.


Hello.

 

Even if writing the dump takes time to complete, this shouldn't cause the Network Agent to fail to restart.

If you provide that dump, we can see what caused the Agent to fail in the first place.


How can I check whether the agent uses an FQDN or an IP address? I did not find anything like this in the agent policy.

 

Changing the recovery option for the KLNagent service on 300 machines would be quite a problem now.
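If the machines are still reachable over the network, the change could in principle be pushed in bulk. The sketch below is only an illustration under several assumptions: the host names are made up, the helper names are hypothetical, remote service control must be allowed through the firewall (which the roaming policy described above may block), and the account needs administrator rights on each target. It uses the `\\host` form of `sc.exe` and reports which machines accepted the change.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def recovery_command(host: str, service: str = "klnagent",
                     delay_ms: int = 180_000) -> List[str]:
    """sc.exe command that sets a delayed restart on a remote machine."""
    actions = "/".join(["restart", str(delay_ms)] * 3)
    return ["sc.exe", rf"\\{host}", "failure", service,
            "reset=", "86400", "actions=", actions]


def run_with_subprocess(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(cmd, capture_output=True, timeout=30).returncode


def apply_to_hosts(hosts: Iterable[str],
                   run: Callable[[List[str]], int] = run_with_subprocess
                   ) -> Dict[str, bool]:
    """Apply the recovery setting to each host; True means exit code 0."""
    results: Dict[str, bool] = {}
    for host in hosts:
        try:
            results[host] = run(recovery_command(host)) == 0
        except (OSError, subprocess.TimeoutExpired):
            # Unreachable host, missing sc.exe, or a timeout.
            results[host] = False
    return results


if __name__ == "__main__":
    # Hypothetical host names; replace with an export from your inventory.
    outcome = apply_to_hosts(["pc-0001", "pc-0002"])
    for name, ok in sorted(outcome.items()):
        print(name, "updated" if ok else "FAILED or unreachable")
```

Machines that come back as failed are the ones that would still need a visit or another delivery channel, so the list is useful even when the push only partly succeeds.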

 

Most of those agents have reported corrupted files in the Kaspersky event logs, so starting or restarting the service does not help, because the corrupted files must be deleted before the service can start.

In some other cases the agent is running but is somehow incorrectly connected to KES, so it reports the wrong protection status. In that case it has to be reinstalled.

 

With my initial question I mainly tried to find a way to prevent all of these situations, as I don't know why the agent files became corrupted or why communication with KES was lost. Maybe the problems are caused by the installation of new agent versions or of some KES patches (PFs).

We have had a lot of other issues with KES clients, and most of the machines have 2-4 additional PFs installed.

 

Every broken agent causes issues with reporting and with managing the station for other systems. A user might not figure that out directly and might report a different issue.

 

Network Agent version: 10.1.249 patch C

 


What versions of KES are you using? Is the issue with Network Agent limited to computers with a specific version of KES?


All computers are running KES 10 MR1, most of them with autopatch A (not all; I think this is caused by some PFs).


Please clarify which PFs you have installed.


The KES version could be up to 10.2.1.23 (a.pf338.pf344.pf440.pf488.pf494).

Most of the machines have pf344, almost 1200 have pf440, and a couple of hundred have pf488. Only a couple of machines have pf338 or pf494.
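To check whether the broken agents cluster around a particular PF, the version strings could be tallied from an export. This is a rough sketch that assumes every string has the same shape as the one quoted above (a dotted base version followed by a dot-separated patch list in parentheses); the function names and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Assumed format: "10.2.1.23 (a.pf338.pf344)"; the parenthesised part is optional.
VERSION_RE = re.compile(
    r"^\s*(?P<base>\d+(?:\.\d+)*)\s*(?:\((?P<patches>[^)]*)\))?\s*$"
)


def parse_kes_version(text):
    """Split a version string into (base_version, [patch, ...])."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    patches = [p for p in (match.group("patches") or "").split(".") if p]
    return match.group("base"), patches


def count_patches(version_strings):
    """Count how many machines carry each patch."""
    counts = Counter()
    for text in version_strings:
        _, patches = parse_kes_version(text)
        counts.update(patches)
    return counts


if __name__ == "__main__":
    # Made-up sample; in practice, feed in the versions of the broken machines.
    sample = [
        "10.2.1.23 (a.pf338.pf344.pf440.pf488.pf494)",
        "10.2.1.23 (a.pf344)",
        "10.2.1.23",
    ]
    for patch, n in count_patches(sample).most_common():
        print(patch, n)
```

Comparing the counts for the roughly 300 problem machines with the counts for the whole installed base would show whether one PF is over-represented.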


To investigate this incident further, please create a CompanyAccount incident and let us know its number.

 

Thank you.
