Channel: ONTAP Discussions topics

cDOT 9.3: How to create a local group with a specific RID?


7MTT failed to transfer a local group to my cDOT system, so I made a backup of lclgroup.cfg, deleted the group, applied the configuration successfully, and reloaded the backup with "useradmin domainuser load". How can I create the corresponding group on the destination SVM with the same RID so that it matches the ACLs on my volumes?
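A minimal sketch of recreating the group on the destination SVM, where the SVM and group names are placeholders. As far as I know, the cDOT command does not expose a RID parameter (the SID/RID is assigned by the system), so if that holds, the ACLs on the volumes would need to be re-applied against the new SID:

vserver cifs users-and-groups local-group create -vserver svm1 -group-name "SVM1\migrated-group" -description "Recreated after 7MTT transfer"
vserver cifs users-and-groups local-group show -vserver svm1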


System Configuration backup (7-Mode & cDOT)


Hi All,

 

I would like to back up the system configuration of Data ONTAP 7-Mode as well as cDOT.

What is the best approach/method to take a full configuration backup? Lastly, I want to know at what stage the configuration backup will work for us.

For example: I have a 10-node cDOT cluster, and if I lose 2 or 3 nodes (crash), what are the steps we need to take to restore the nodes and restore the config to how it was before?
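A hedged sketch of the backup commands I would start with; node and file names are placeholders. On 7-Mode, config dump writes a per-controller configuration file; on cDOT, system configuration backup creates a cluster-wide backup (the matching restore commands live under system configuration recovery at advanced privilege):

7-Mode (per controller):
config dump -v /etc/configs/pre_change_backup

cDOT:
system configuration backup create -node node01 -backup-name manual_backup.7z -backup-type cluster
system configuration backup show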

 

Thanks in advance.

Arsalan.

Syslog Event Source IPs


Guys, I have 2 separate C-Mode 9.3 clusters that syslog events to an external syslog server. This is generally working fine, and both have the exact same policy defined. However, on one cluster I get the following:

 

  • syslogs with a source IP for each node and for the cluster IP, i.e. each node is sending syslogs and so is the cluster.

On the other one, though, I only get logs from the node IPs and not the cluster.

 

Is this configurable somewhere?
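A quick way to compare the two clusters, assuming syslog is set up via cluster log-forwarding: list the forwarding targets, then the management LIFs whose addresses could show up as the source IP, then the routes each cluster would pick to reach the syslog server:

cluster log-forwarding show
network interface show -role node-mgmt,cluster-mgmt -fields address,curr-node
network route show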

Syslog Port

Is it possible to change the destination port used for a notification destination in C-Mode 9.3?
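I don't believe the EMS notification destination in 9.3 exposes a port setting, but cluster log-forwarding does have a -port parameter to my recollection; a sketch, where the address is a documentation placeholder:

cluster log-forwarding create -destination 192.0.2.50 -port 1514 -facility user
cluster log-forwarding show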

OSSV Backup failed


Hello All,

Could someone kindly help me with basic troubleshooting of failed OSSV backups? In my environment there are many failed OSSV backups.

 

The setup is: primary volume to server, then from the server to the secondary filer.
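A first-pass checklist I'd suggest, assuming a 7-Mode secondary; the volume name is a placeholder. On the secondary filer, check the relationship status and schedules; on the OSSV host, svinstallcheck validates the agent installation:

On the secondary filer:
snapvault status -l
snapvault snap sched secondary_vol

On the OSSV host:
svinstallcheck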

 

 

 

Filer Takeover-Continues to fail back


Problem Description:         cfbi01n2 has taken over cfbi01n1

 
Summary: Ticket for a failover on an IBM N7900 filer: cfbi01n2 has taken over cfbi01n1. Over this period I've resolved multiple existing errors on the filers. After fixing three intermittent FC errors, fixing the network issue, and then the issue with the AT-FCX modules, cfbi01n1 panicked with the error below. I'm no longer seeing any hardware-related issues or reasons the giveback fails other than the panic message below. We also upgraded the software to the latest version, 8.2.5P1.


Old Panic Message:
Tue Sep 25 19:31:04 CDT [cfbi01n1:mgr.stack.string:notice]: Panic string: protection fault on VA 0 code = 0 cs:rip = 0x20:0xffffffff83f74e2e in SK process wafl_hipri on release 8.2.4P4 (this has since cleared).

 

New Panic Error

Tue Oct  9 18:41:30 CDT [cfbi01n1:mgr.partner.stack.saved:notice]: Cluster takeover has saved partPanic string: protection fault on VA 0 code = 0 cs:rip = 0x20:0xffffffff86429c01 in SK process NwkThd_03 on release 8.2.5P1 


The following summarizes certain events and related action items.

 

8/6/18- Confirmed we are not seeing any hardware errors on cfbi01n2 & n1 after the visual triage. We checked the SFPs and cables associated with 2a & 8b. Requested the NAS team perform a cf giveback.

 

8/15/18- While troubleshooting the 2a/8b FC data ports it was discovered that these filer nodes are each only connected to one FC loop (cfbi01n1 to Loop A, cfbi01n2 to Loop B) for the following FC data interface ports: 2a/8b, 2b/8a, 2c/8c. The other FC data interface ports on these filers are configured for dual FC loop connections: 0a/0d, 0b/0e, 0c/0h. This leads to what is called a "mixed-path" configuration (partial single path, partial dual path). This type of configuration is susceptible to single-point-of-failure outages for any data configured for single path only.
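A quick way to confirm which disks are single-pathed on a 7-Mode system (a sketch; run on each node): storage show disk -p lists the primary and secondary path per disk, so single-attached loops show an empty secondary column, and fcadmin device_map shows the shelves behind each adapter:

storage show disk -p
fcadmin device_map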


8/17/18- To summarize the activity performed on cfbi01n1/cfbi01n2 to bring cfbi01n1 back into the HA config:
1. Recheck/reseat all cabling, SFPs and shelf modules for cfbi01n1 ports 2a/8b
2. Replacement of possible failing module in shelf 1 for FC loop A (the only active/configured loop for cfbi01n1)
3. Test loop A connectivity for cfbi01n1 and note intermittent error with shelf 4 for FC loop A
4. Replacement of module in shelf 4 for FC loop A
5. Boot of cfbi01n1 no longer blocked due to multiple loop A errors for ports 2a/8b and node is in “ready for giveback” state
6. Console logs are still showing intermittent error for 2a.63/8b.63 (shelf 4), but healthy for giveback to join the HA
7. Giveback completed for cfbi01n1, triggering autosupports from both nodes to review
8. Giveback failed within an hour

8/18/18- The NAS team received reports from clients that NAS storage was in a disconnected state on 70 ESXi hosts, and cfbi01n1-nas10.sldc.sbc.com was not connecting from ESXi or the Windows jump server. However, after the node was taken over again the issue was resolved, so we suspect a routing issue. The NAS team resolved it internally.

9/6/18- We had issues with cfbi01n1 again. We had 80 servers that were not able to access nas10, and we could not ping the gateway (130.1.71.3) from the NetApp arrays. I had T2 do a vif failover and they were then able to ping the gateway, but the filer panicked and did a failover; the core file is currently dumping. When we tried to fail over to the other interface in the ifgrp (from e5a to e5b) it caused the filer to panic.

9/10/18- opened the case with Cisco and identified the module as faulty. It was replaced by Cisco on 9/13/18.

9/25/18- The next window for giveback was approved. We identified faults on 2c/8c prior to the filer failing back over. On cfbi01n1 we noticed 8c as hard down, and 2c as hard down on cfbi01n2. Our FE remained onsite as we suspected a fault on the filer. We proceeded to trace the cables from both filer heads and found the shelves in question. We noticed LED lights out on shelf 1, module A/In port and shelf 6, module A/Out port. After authorization from the NAS team we replaced both I/O modules. The LED status came back to solid green and the NAS team proceeded to do a cf giveback; it failed again within 1 hour.

 

10/9/18- Upgraded to 8.2.5P1. The giveback failed again within 1.5 hours.

ONTAP 9.3x PowerShell


Does the PowerShell Toolkit support the Set-Location cmdlet?

 

It would be nice to be able to get a directory listing via PS without creating a CIFS share or exporting an NFS mount, especially when searching for files recursively. I am surprised that the toolkit does not support Set-Location and PSDrives to ONTAP shares.

Or... does it?
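Not Set-Location/PSDrives as far as I know, but the toolkit's Read-NcDirectory cmdlet can walk a volume path over the management API without any CIFS/NFS mount. A sketch, where the cluster name, credentials, SVM, and path are placeholders:

Import-Module DataONTAP
Connect-NcController cluster1.example.com -Credential (Get-Credential)
# List the contents of /vol/vol1 through the management API rather than a file share
Read-NcDirectory -Path /vol/vol1 -VserverContext svm1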

 

And BTW, where is the PS community for ONTAP?

Error reverting cluster LIF to home port after upgrade


Hello,

 

we upgraded in August to ONTAP version 9.4P1

 

Today I was checking some config in our cluster and noticed that two cluster LIFs are not on their home port:

network interface show -role cluster                                                                         
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
Cluster
            x-01_clus1 up/up  169.254.4.144/16   x-01      e0b     false
            x-01_clus2 up/up  169.254.130.246/16 x-01      e0c     true
            x-01_clus3 up/up  169.254.131.229/16 x-01      e0b     true
            x-01_clus4 up/up  169.254.168.120/16 x-01      e0d     true
            x-02_clus1 up/up  169.254.89.228/16  x-02      e0b     false
            x-02_clus2 up/up  169.254.106.140/16 x-02      e0c     true
            x-02_clus3 up/up  169.254.26.197/16  x-02      e0b     true
            x-02_clus4 up/up  169.254.4.232/16   x-02      e0d     true
8 entries were displayed.

As you can see here, two LIFs are not at home; the home port should be e0a.

network port show -role cluster                                                                              

Node: x-01
                                                                       Ignore
                                                  Speed(Mbps) Health   Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
e0a       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0b       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0c       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0d       Cluster      Cluster          up   9000  auto/10000 healthy  false

Node: x-02
                                                                       Ignore
                                                  Speed(Mbps) Health   Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
e0a       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0b       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0c       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0d       Cluster      Cluster          up   9000  auto/10000 healthy  false
8 entries were displayed.


network interface failover-groups show -vserver Cluster -failover-group Cluster 
       Vserver Name: Cluster
Failover Group Name: Cluster
   Failover Targets: x-02:e0a, x-02:e0b, x-02:e0c, x-02:e0d,
                     x-01:e0a, x-01:e0b, x-01:e0c, x-01:e0d
   Broadcast Domain: Cluster


network interface show -vserver Cluster -lif x-01_clus1 -instance 

                               Vserver Name: Cluster
                     Logical Interface Name: x-01_clus1
                                       Role: cluster
                              Data Protocol: none
                            Network Address: 169.254.4.144
                                    Netmask: 255.255.0.0
                        Bits in the Netmask: 16
                                Subnet Name: -
                                  Home Node: x-01
                                  Home Port: e0a
                               Current Node: x-01
                               Current Port: e0b
                         Operational Status: up
                            Extended Status: -
                                 Numeric ID: 1024
                                    Is Home: false
                      Administrative Status: up
                            Failover Policy: local-only
                            Firewall Policy: 
                                Auto Revert: true
                                Sticky Flag: false
              Fully Qualified DNS Zone Name: none
                    DNS Query Listen Enable: false
(DEPRECATED)-Load Balancing Migrate Allowed: false
                       Load Balanced Weight: load
                        Failover Group Name: Cluster
                                   FCP WWPN: -
                             Address family: ipv4
                                    Comment: -
                             IPspace of LIF: Cluster
             Is Dynamic DNS Update Enabled?: -

 

When I try to revert the cluster LIF on the correct node (I know you can only do this from the local node), I get the following error:

network interface revert -vserver Cluster -lif x-01_clus1

->
Error: command failed: LIF "x-01_clus1" failed to migrate: failed to move cluster/node-mgmt LIF.

Also a migration to that port is failing with the same error.

 

On the cluster switch, I saw some errors on that interface:

show interface 0/1

Packets Received Without Error................. 5389711331
Packets Received With Error.................... 1139
Broadcast Packets Received..................... 3128
Receive Packets Discarded...................... 0
Packets Transmitted Without Errors............. 6369143239
Transmit Packets Discarded..................... 0
Transmit Packet Errors......................... 0
Collision Frames............................... 0
Number of link down events..................... 6

I shut down that port and cleared the counters, but even then I wasn't able to revert the LIF.

 

I then shut down the home port e0a and enabled it again:

network port modify -node x-01 -port e0a -up-admin false

network port modify -node x-01 -port e0a -up-admin true

-> didn't help! Same issue.

 

Then I tried a migrate with the force flag:

net int migrate -vserver Cluster -lif x-01_clus1 -destination-node x-01 -destination-port e0a -force

->
Warning: Migrating LIF "x-01_clus1" to node "x-01" using the "force" parameter might cause this LIF to be configured on multiple nodes in the cluster. Use the "network interface show -vserver
         Cluster -lif x-01_clus1" command to verify the LIF's operational status is not "up" before using this command.
Do you want to continue? {y|n}: y

Error: command failed: LIF "x-01_clus1" failed to migrate: failed to move cluster/node-mgmt LIF.

Same problem, not able to revert.
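One thing worth checking before anything else (a sketch; both commands run at advanced privilege): cluster ping-cluster exercises the cluster network from one node to all cluster LIFs and would show whether e0a can actually pass traffic, and the port health fields confirm whether ONTAP has internally marked e0a degraded even though the link shows up:

set advanced
cluster ping-cluster -node x-01
network port show -node x-01 -port e0a -fields health-status,ignore-health-status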

 

Does anyone know this problem and have a solution for me?

We haven't changed the NetApp config recently, only the upgrade from 9.3 to 9.4.

 

Best regards

Florian


Disable CP type U triggers


CDOT 8.3.2P5

 

We recently installed the netapp-harvest/graphite/grafana combo and we're really impressed with the information we are getting out of it.

One of the things I've noticed is that the single biggest cause of CP events is the flush/sync trigger (CP type U).

Is it possible to configure ONTAP to disable/ignore this trigger?

I'm pretty sure that we would be happy to wait for the 10 second timer to expire to trigger the next CP event.
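I'm not aware of a supported knob to disable a CP trigger, but you can at least confirm in real time how dominant the flush trigger is; a sketch, with the node name a placeholder (the CP ty column in sysstat -x shows the trigger type per interval):

node run -node node01 -command sysstat -x 1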

 

creating flexgroup workflow


Is it possible to create a new workflow in WFA to create a flexgroup, not just expand it? Is there a workflow available to download?
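I'm not aware of a certified pack for this, but a custom WFA command could simply wrap the FlexGroup create CLI (available since ONTAP 9.1); a sketch with placeholder names:

volume create -vserver svm1 -volume fg01 -aggr-list aggr1,aggr2 -aggr-list-multiplier 8 -size 100TB -junction-path /fg01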

Approach for Tracking Config Changes and Comparing Systems


Guys, looking to do two things with my clusters:

  1. Track all changes made by admins
  2. Report on config drift between systems, i.e. clusters that should have the exact same config elements as another cluster

Item 1: I assume I can simply syslog everything out to ELK/Splunk etc.? Could NaBox help here? I see it has some Elastic elements (Logstash etc.), so can these be leveraged?

 

Item 2: An example of this would be that I have, say, two clusters that should have the exact same fpolicy and log forwarding configs. How do I check that they are the same? A PowerShell script with Compare-Object seems like one route.
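For item 1, a sketch of the pieces I'd check on each cluster (the destination address is a placeholder): to my recollection ONTAP records set requests in the audit/command-history log, security audit controls whether get requests are included, and cluster log-forwarding ships it to an external syslog target such as ELK/Splunk:

security audit show
security audit modify -cliget on
cluster log-forwarding create -destination 192.0.2.25 -port 514 -facility local7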

 

Thanks in advance.

 

 

Unable to take a backup of WFA; also failing while trying to download logs


Hi, 

 

I am trying to take a backup of WFA, in which I have written lots of customised workflows.

But I am getting the error "Unable to complete your action, please try again or contact your system administrator."

 

I am also facing the same issue while trying to download the logs after workflow execution.
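As a possible workaround while the GUI action fails, WFA also exposes a backup over its REST interface; this is a sketch from memory, and the endpoint, query parameter, server name, and credentials are all assumptions to verify against your WFA version's REST documentation:

curl -k -u admin -o wfa_backup "https://wfa-server/rest/backups?full=true"

It may also be worth checking the WFA server-side logs under the installation's log directory for the underlying error.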

How to identify a node by using commands


I am trying to physically identify a node within an HA pair; are there any commands for this? I am not sure which one is which in the cabinet in the data center. Thanks for your input!
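A sketch of the approach I'd use: pull the serial numbers from the CLI and match them against the labels on each chassis (the SP/BMC IP can also be traced to its cabled switch port):

system node show -fields serial-number,model
system service-processor show -fields ip-address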

New TR Released: TR-4380: SAN Migration Using Foreign LUN Import


This document provides guidance in planning for and migrating SAN (block) data using a five-phase data migration process designed to handle the most complex migration scenarios while heeding specific customer needs and specific environments.

For more info, please see the full TR.

 

single-path iSCSI setup incurs downtime during HA node upgrade?


Environment:

ONTAP 9.1 on an HA pair

Windows Server 2008 R2 (no MPIO installed)

 

Hi,

I have an iSCSI LUN that's accessible via an ifgrp on a node. When I upgrade that node in the HA pair, doesn't the interface fail over to the other node, so that there is no downtime? In other words, even though I don't have MPIO set up to access both nodes, the iSCSI connectivity should not go down during the failover and giveback?
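One thing worth checking, since SAN LIFs behave differently from NAS LIFs here: iSCSI/FC LIFs do not migrate between nodes (their failover policy shows disabled), and path redundancy is expected to come from host-side MPIO instead. A quick check, with svm1 as a placeholder:

network interface show -vserver svm1 -fields data-protocol,failover-policy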

 

Thanks,


False positive in log: vsdr.rootvol.has.data


Hi, has anyone seen this before?

The error vsdr.rootvol.has.data is generated every few minutes for some SVMs: "Vserver svm01 has data in the root volume. Vserver DR does not protect data in the root volume." I checked the root vol of those SVMs via CIFS, and there is nothing in there besides shares. There are a lot of shares though, about 1800. Perhaps the pointer files themselves take up too much space? Does anyone know how the system detects that there is data on the root volume?

Here is the space used by a root volume:

volume show-space -vserver svm01 -volume svm01_root

Vserver: svm01
Volume Name: svm01_root
Volume MSID: 2148629449
Volume DSID: 1026
Vserver UUID: x
Aggregate Name: x
Aggregate UUID: x
Hostname: x
User Data: 64KB
User Data Percent: 0%
Deduplication: -
Deduplication Percent: -
Temporary Deduplication: -
Temporary Deduplication Percent: -
Filesystem Metadata: 820KB
Filesystem Metadata Percent: 0%
SnapMirror Metadata: -
SnapMirror Metadata Percent: -
Tape Backup Metadata: -
Tape Backup Metadata Percent: -
Quota Metadata: -
Quota Metadata Percent: -
Inodes: 80KB
Inodes Percent: 0%
Inodes Upgrade: -
Inodes Upgrade Percent: -
Snapshot Reserve: 51.20MB
Snapshot Reserve Percent: 5%
Snapshot Reserve Unusable: -
Snapshot Reserve Unusable Percent: -
Snapshot Spill: -
Snapshot Spill Percent: -
Performance Metadata: 9.28MB
Performance Metadata Percent: 1%
Total Used: 61.42MB
Total Used Percent: 6%
Total Physical Used Size: 29.95MB
Physical Used Percentage: 3%
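To test the theory that the ~1800 share/junction directories are what trips the check, it may help to compare inode usage rather than space; a sketch:

volume show -vserver svm01 -volume svm01_root -fields files,files-used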

 

 

Thanks!

Content-Security-Policy HTTP header Not Implemented


Title of Vulnerability: Content Security Policy (CSP) Not Implemented - Risk Level: Moderate (CVSS=5.0)

Rationale/Finding Description: The NetApp device's web interface fails to implement CSP protection. CSP, if implemented, prevents cross-site scripting (XSS), clickjacking, and other code injection attacks that result from the execution of malicious content in the trusted web page context.

It's a browser-side mechanism that allows administrators to create whitelists for client-side resources of the web interface (JavaScript, CSS, images, etc.). CSP is delivered via a special HTTP header that instructs the browser to only execute or render resources from the whitelist.

An attack requires publicly available tools, a considerable amount of time, and knowledge of existing code injection weaknesses in the web interface.

A successful attack could allow an attacker to exploit the web interface via code injection attacks such as XSS.

Recommendation for Mitigation: Enable CSP on the web interface by sending the Content-Security-Policy in HTTP response headers. For example: Content-Security-Policy: default-src 'self'; script-src 'self'

Single Sign On


Greetings and Happy V-day,

 

Does NetApp currently have SSO for their filers, specifically for 7-Mode? If so, which version, and do you have a link I can follow so that I can configure the filers? Thank you in advance.

 

James

 

volume move fails with Not enough space in volume for snapshot operation


I am trying to move a volume from aggr 1a to aggr 1c, and my volume move fails with the error: "Error: Creating Snapshot copy with owner tag: Not enough space in volume for snapshot operation"

 

I tried setting a snapshot reserve of 20%, but the move after that failed with the same error. Any idea what I am missing?
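Volume move needs room in the source volume for its own Snapshot copy, so checking actual free space (not just the reserve) and temporarily growing the volume may be worth a try; a sketch with placeholder names:

volume show -vserver svm1 -volume vol1 -fields available,percent-used,percent-snapshot-space
volume snapshot show -vserver svm1 -volume vol1
volume size -vserver svm1 -volume vol1 -new-size +10g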

AD functionality level with CIFS


We currently have a project underway to upgrade our domain functional level from Server 2008 R2 to Server 2016. I'm trying to find a reference as to what versions of ONTAP are supported, or anything I need to be aware of when this upgrade occurs. Does anyone have experience doing this, or can you point me in the direction of any resources?

 

Running a combination of ONTAP 9.0 - 9.3 across multiple sites. All clusters serve CIFS file shares, joined to the domain in question. 
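One practical thing to verify per cluster, with svm1 as a placeholder: Server 2016 domain controllers commonly have SMB1 disabled, so confirm the SVMs can use SMB2 for DC connections (an option available in ONTAP 9.1+ to my recollection) and that the new DCs are being discovered:

vserver cifs security show -vserver svm1 -fields smb2-enabled-for-dc-connections
vserver cifs domain discovered-servers show -vserver svm1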

 

Thanks,

James
