[Libstoragemgmt-devel] [RFC] Block level python API document draft done.

Discussion:

Gris Ge

2014-04-14 10:25:42 UTC

Hi Team,

I just finished the block API document drafting.[1]
It defines what the API will look like and how users should use it.
# Yes, creating a document for plugin developers has been on the TODO
# list (for a while).

Anything not listed in that document will be purged. If I missed
anything, let me know.

Any feedback will be appreciated.

Thank you in advance.

[1] https://sourceforge.net/p/libstoragemgmt/wiki/Python_API_Usage/

TODO:
0. Document review.
# The changelog will be included in patch.
1. lsm.MaskInfo and lsm.VolumeReplication have no code yet.
# I will do the initial work and update it based on feedback.
# I could use some help with the C/C++ code.
# Due to the well-known complexity of replication in SMI-S,
# there might be no VolumeReplication plugin implementation in the
# short term. Only the simulator will be ready for it.
2. The lsm.AccessGroup related cleanup.
# I will do this after MaskInfo done.
3. File API documenting:
* File System -- lsm.FileSystem
* File System Export -- lsm.NfsExport
* File System Snapshot -- lsm.FsSnapshot
* File System Clone -- lsm.FsClone
# I will work on this if no one else takes this task.
4. Sync existing code with the documents.
* lsm.System
* lsm.Disk
* lsm.Volume
5. Sync plugins with API changes.
6. Move value conversion from _data.py to lsmcli_data_display.py

Tony,

If we'd like to speed up API stabilization, could you help me with the
C/C++ code and the git tree commit work?

Thanks.
Best regards.

--
Gris Ge

Tony Asleson

2014-04-14 18:10:30 UTC

Post by Gris Ge
Hi Team,
I just finished the block API document drafting.[1]
It defines how the API will look like and how user should use them.
# Yes. Create a document for plugin developer is on TODO list(for a
# while).
Anything not listed in that document will be purged. If I missed
anything, let me know.

* Capabilities method missing usage, parameters, returns etc.
* Disk.
- We had previously agreed on keeping block size and number of blocks,
documentation only has raw_size_bytes.
- Do we need the status_info field, or can we report it through one of
the other status_info fields? I don't remember if we agreed to make
these mandatory across all classes. If a user gets system, will the
status_info inform about a disk error or require the user to get the
status_info on specific disk?
- usable_size_bytes, we should remove this. Some arrays may have more
than one raid type utilizing regions of the same disk, thus this field
would make no sense.

* Pool
- Thinking we should make element_type mandatory, would this be an issue?
- ELEMENT_TYPE_SYS_RESERVED, I believe we should remove this and not
return pools of this type to the user, thus this wouldn't be needed

* pool_create_from_disks: We need a way to tell which disks can be used
for this method.

* lsm.Client.volumes()
- Same question on status_info from above
- Are we going to support all search fields for all plug-ins?

* VolumeReplication, wondering if we can come up with a better name?

* 9.9 Copy paste error?, volume_replication_giveback documented as
volume_replication_failover

* volume_replication_giveback: What arrays support this? What are the
differences between volume_replication_restore and
volume_replication_giveback other than different types of supported
replication types?

Post by Gris Ge
Any feedback will be appreciated.
Thank you in advance.
[1] https://sourceforge.net/p/libstoragemgmt/wiki/Python_API_Usage/
0. Document review.
# The changelog will be included in patch.
1. The lsm.MaskInfo and lsm.VolumeReplication has no code yet.
# I will take the initial work and update on feedback.
# I could use some help on C/C++ codes.
# Due to well-known complexity of SMI-S on replication,
# there might be no VolumeReplication plugin implementation in a
# short time. Only simulator will be ready for it.
2. The lsm.AccessGroup related cleanup.
# I will do this after MaskInfo done.
* File System -- lsm.FileSystem
* File System Export -- lsm.NfsExport
* File System Snapshot -- lsm.FsSnapshot
* File System Clone -- lsm.FsClone
# I will work on this if no one take this task.
4. Sync existing codes with documents.
* lsm.System
* lsm.Disk
* lsm.Volume
5. Sync plugins with API changes.
6. Move value converting from _data.py to lsmcli_data_display.py
Tony,
If we'd like to speed up on API stabilization, could you help me on
C/C++ codes and git tree commit work?

Sure. Some of these changes, although they appear small, are going to
require quite a bit of code change.

Make sure that as we refactor, the tests get modified and updated and
continue to run clean.

Regards,
Tony

Gris Ge

2014-04-15 11:04:54 UTC

Post by Tony Asleson
* Capabilities method missing usage, parameters, returns etc.

Sorry, I missed that part. Added. But I have some concerns:

Currently, lsm.Capabilities.get() can return these values:
lsm.Capabilities.UNSUPPORTED
lsm.Capabilities.SUPPORTED
lsm.Capabilities.SUPPORTED_OFFLINE
lsm.Capabilities.NOT_IMPLEMENTED
lsm.Capabilities.UNKNOWN

If we document them, we have to add more lines to the documentation of
every method about how users should proceed when they get
'lsm.Capabilities.UNKNOWN' or 'lsm.Capabilities.SUPPORTED_OFFLINE'.

May I suggest that we simplify this and just return 'True' or 'False'?

Current design:
    if lsm_cap.get(lsm.Capabilities.DISKS) == lsm.Capabilities.SUPPORTED:
        lsm_disks = lsm_cli.disks()
Suggested design:
    if lsm_cap.get(lsm.Capabilities.DISKS):
        lsm_disks = lsm_cli.disks()

Post by Tony Asleson
* Disk.
- We had previously agreed on keeping block size and number of blocks,
documentation only has raw_size_bytes.

Added. Thanks.

Post by Tony Asleson
- Do we need the status_info field, or can we report it through one of
the other status_info fields? I don't remember if we agreed to make
these mandatory across all classes. If a user gets system, will the
status_info inform about a disk error or require the user to get the
status_info on specific disk?

It sounds better if only lsm.System and lsm.Pool hold status_info.
Document updated: only lsm.System and lsm.Pool have a status_info
property.

Post by Tony Asleson
- usable_size_bytes, we should remove this. Some arrays may have more
than one raid type utilizing regions of the same disk, thus this field
would make no sense.

Removed. Reasonable. Thanks.

Post by Tony Asleson
* Pool
- Thinking we should make element_type mandatory, would this be an issue?
- ELEMENT_TYPE_SYS_RESERVED, I believe we should remove this and not
return pools of this type to the user, thus this wouldn't be needed

NetApp ONTAP allows users to use the system pool (aggr0); I don't think
we should override that setting.
IBM XIV only has a system pool, and all other pools are sub-pools of it.
If we hide the system pool, there is no way for us to know the total
usable size of an IBM XIV. (Actually, the current code does not show the
system pool, but I intend to show it in the future; it helps users
determine the available space when creating a sub-pool.)
I am OK with promoting 'element_type' to mandatory and removing
'ELEMENT_TYPE_SYS_RESERVED', but I do not agree with hiding the system
pool.

Post by Tony Asleson
* pool_create_from_disks: We need a way to tell which disks can be used
for this method.

How about a lsm.Disk.disk_role property:
    lsm.Disk.DISK_ROLE_POOL_MEMBER
        # Acting as a pool member.
    lsm.Disk.DISK_ROLE_SPARE
        # Acting as a spare disk.
    lsm.Disk.DISK_ROLE_FREE
        # Can be used for pool creation (if supported) or as a spare
        # disk (if supported).
    lsm.Disk.DISK_ROLE_SYSTEM_RESERVE
        # Not usable.
    lsm.Disk.DISK_ROLE_UNKNOWN
        # Not sure.
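
A minimal sketch of how a caller might use the proposed disk_role property to find candidates for pool_create_from_disks. The Disk class below is a hypothetical stand-in, not the real lsm.Disk, and the string values of the role constants are assumptions; only the constant names come from the proposal above.

```python
from dataclasses import dataclass

# Role names follow the proposal in this thread; the values are assumed.
DISK_ROLE_POOL_MEMBER = "POOL_MEMBER"
DISK_ROLE_SPARE = "SPARE"
DISK_ROLE_FREE = "FREE"
DISK_ROLE_SYSTEM_RESERVE = "SYSTEM_RESERVE"
DISK_ROLE_UNKNOWN = "UNKNOWN"


@dataclass
class Disk:
    """Hypothetical stand-in for lsm.Disk with only the fields used here."""
    id: str
    disk_role: str


def disks_usable_for_pool(disks):
    """Return the disks whose role marks them as free for pool creation."""
    return [d for d in disks if d.disk_role == DISK_ROLE_FREE]


disks = [
    Disk("DISK_1", DISK_ROLE_POOL_MEMBER),
    Disk("DISK_2", DISK_ROLE_FREE),
    Disk("DISK_3", DISK_ROLE_SPARE),
    Disk("DISK_4", DISK_ROLE_FREE),
]
candidates = disks_usable_for_pool(disks)
```

With this sample list, candidates holds DISK_2 and DISK_4 only.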

Post by Tony Asleson
* lsm.Client.volumes()
- Same question on status_info from above
- Are we going to support all search fields for all plug-ins?

Yes. To save search time and the amount of data transferred, the plugin
should do the searching.
To reduce the load on plugin developers, we can allow a plugin to set a
flag (or some other magic) to inform _client.py whether it can handle
searching (maybe during plugin_startup()). If not, _client.py will
search/filter the returned data in the routine way.
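
The fallback idea above could be sketched roughly as follows. Everything here is hypothetical: the handles_search flag, the plugin classes and the dict-based volumes are illustration only and do not reflect the actual _client.py or plugin interface.

```python
class FilteringPlugin:
    """Hypothetical plugin that can search on the array side."""
    handles_search = True

    def __init__(self, volumes):
        self._volumes = volumes

    def volumes(self, search_key=None, search_value=None):
        if search_key is None:
            return list(self._volumes)
        return [v for v in self._volumes if v.get(search_key) == search_value]


class PlainPlugin:
    """Hypothetical plugin that can only return every volume."""
    handles_search = False

    def __init__(self, volumes):
        self._volumes = volumes

    def volumes(self):
        return list(self._volumes)


def client_volumes(plugin, search_key=None, search_value=None):
    """Let the plugin search if it says it can; otherwise filter client-side."""
    if search_key is None:
        return plugin.volumes()
    if getattr(plugin, "handles_search", False):
        return plugin.volumes(search_key, search_value)
    # Fallback: fetch everything and filter in the client.
    return [v for v in plugin.volumes() if v.get(search_key) == search_value]


data = [
    {"id": "VOL_1", "pool_id": "POOL_A"},
    {"id": "VOL_2", "pool_id": "POOL_B"},
    {"id": "VOL_3", "pool_id": "POOL_A"},
]
from_filtering = client_volumes(FilteringPlugin(data), "pool_id", "POOL_A")
from_plain = client_volumes(PlainPlugin(data), "pool_id", "POOL_A")
```

Both calls return the same two volumes; the difference is only where the filtering happens and how much data crosses the plugin boundary.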

Post by Tony Asleson
* VolumeReplication, wondering if we can come up with a better name?

Some documents call it 'Replica'.
I am not a native English speaker and don't have a creative mind. Do you
have any name in mind?
BTW, there are still FsSnapshot and FsClone; should we merge them
into 'FsReplica' (which also needs a better name)?

Post by Tony Asleson
* 9.9 Copy paste error?, volume_replication_giveback documented as
volume_replication_failover

Fixed. Thanks.

Post by Tony Asleson
* volume_replication_giveback: What arrays support this? What are the
differences between volume_replication_restore and
volume_replication_giveback other than different types of supported
replication types?

Restore copies the target volume's data back to the source volume.
Failover and giveback are only for SYNC_REMOTE and ASYNC_REMOTE
(technically SYNC_LOCAL and ASYNC_LOCAL could also support them, but no
vendor does yet):

Normal state:
    PR (Production) site:
        source volume is RW.
    DR site:
        target volume is RO or NO_ACCESS.
    Sync status:
        Syncing with SYNC or ASYNC.

Failover:
    PR (Production) site:
        source volume is NO_ACCESS or RO, or the whole site is
        completely down.
    DR site:
        target volume is RW. The DR site is acting as the production
        environment, handling all user data.
    Sync status:
        No sync.

Giveback:
    PR (Production) site:
        source volume is NO_ACCESS or RO or RW,
        depending on vendor design.
    DR site:
        target volume is NO_ACCESS or RO or RW,
        depending on vendor design.
    Sync status:
        Data is being copied from the DR site target volume to the PR
        site source volume.
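
The three states above can be written down as a small lookup table. This is only a sketch: the dictionary layout, the "DOWN" marker for an unreachable production site and the helper name are assumptions for illustration, not part of the proposed API.

```python
# Allowed access modes per site and the copy direction for each state,
# transcribed from the description above. "DOWN" is an assumed marker
# for "the whole site is completely down".
REPLICATION_STATES = {
    "normal": {
        "source": {"RW"},
        "target": {"RO", "NO_ACCESS"},
        "copy_direction": "source->target",
    },
    "failover": {
        "source": {"NO_ACCESS", "RO", "DOWN"},
        "target": {"RW"},
        "copy_direction": None,  # no sync while failed over
    },
    "giveback": {
        # Vendor dependent on both sides.
        "source": {"NO_ACCESS", "RO", "RW"},
        "target": {"NO_ACCESS", "RO", "RW"},
        "copy_direction": "target->source",
    },
}


def access_allowed(state, source_access, target_access):
    """Check whether a source/target access pair fits the given state."""
    spec = REPLICATION_STATES[state]
    return source_access in spec["source"] and target_access in spec["target"]
```

For example, access_allowed("normal", "RW", "RW") is False because the target must not be writable while the pair is syncing normally.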

EMC VMAX/Symmetrix SRDF and SRDF/A support failover and giveback.
Since this use case is pretty common, I believe NetApp supports it
as well.

Post by Tony Asleson
Sure. Some of these change although appearing small are going to
require quite a bit of code changes.
Make sure as we re-factor that tests gets modified and updated and
continue to run clean.

Noted.

Post by Tony Asleson
Regards,
Tony

Thanks.

--
Gris Ge

Tony Asleson

2014-04-15 16:53:04 UTC

Post by Gris Ge

Post by Tony Asleson
* Capabilities method missing usage, parameters, returns etc.

lsm.Capabilities.UNSUPPORTED
lsm.Capabilities.SUPPORTED
lsm.Capabilities.SUPPORTED_OFFLINE
lsm.Capabilities.NOT_IMPLEMENTED
lsm.Capabilities.UNKNOWN
If we document them, we have to add more lines in document of
every method about how user should proceed when getting
'lsm.Capabilities.UNKNOWN' and 'lsm.Capabilities.SUPPORTED_OFFLINE'.

I just checked, I'm using this wrong in the plugin test code ( eg. if
lsm_cap.get(lsm.Capabilities.DISKS)). It kind of works because
unsupported is 0 which evaluates to false in Python. I got it correct
when I wrote the code in the command line, but I had just written the
capability code then.
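
A small sketch of why the truthiness shortcut only "kind of works". The class below is a stand-in for lsm.Capabilities; UNSUPPORTED being 0 comes from this thread, while the other numeric values and the DISKS key are assumptions made for the example.

```python
class Capabilities:
    """Stand-in for lsm.Capabilities; values other than 0 are assumed."""
    UNSUPPORTED = 0
    SUPPORTED = 1
    SUPPORTED_OFFLINE = 2
    NOT_IMPLEMENTED = 3
    UNKNOWN = 4

    DISKS = "DISKS"  # hypothetical capability key

    def __init__(self, states):
        self._states = dict(states)

    def get(self, capability):
        return self._states.get(capability, Capabilities.UNKNOWN)


def truthy_check(cap):
    # The shortcut: correct only while every "no" state is 0.
    return bool(cap.get(Capabilities.DISKS))


def explicit_check(cap):
    # The documented usage: compare against SUPPORTED.
    return cap.get(Capabilities.DISKS) == Capabilities.SUPPORTED


unsupported = Capabilities({Capabilities.DISKS: Capabilities.UNSUPPORTED})
not_implemented = Capabilities({Capabilities.DISKS: Capabilities.NOT_IMPLEMENTED})
```

With these assumed values the two checks agree for UNSUPPORTED, but truthy_check(not_implemented) is True while explicit_check(not_implemented) is False.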

Post by Gris Ge
May I suggest us to simplify them just return 'True' and 'False'?

I agree this would be more intuitive, but I'm not sure we can do this.

There are some arrays which will need to have a volume taken offline
before you can issue a re-size for example. In these cases internally
we could take it off line and then do the operation and then return it
to online status if it still exists, but then the behavior would be
different. For example some arrays can handle a re-size for volume
while IO in progress.

Gris Ge

2014-04-16 13:42:44 UTC

Post by Tony Asleson
I just checked, I'm using this wrong in the plugin test code ( eg. if
lsm_cap.get(lsm.Capabilities.DISKS)). It kind of works because
unsupported is 0 which evaluates to false in Python. I got it correct
when I wrote the code in the command line, but I had just written the
capability code then.

Post by Gris Ge
May I suggest us to simplify them just return 'True' and 'False'?

I agree this would be more intuitive, but I'm not sure we can do this.
There are some arrays which will need to have a volume taken offline
before you can issue a re-size for example. In these cases internally
we could take it off line and then do the operation and then return it
to online status if it still exists, but then the behavior would be
different. For example some arrays can handle a re-size for volume

Tony Asleson

2014-04-16 17:59:13 UTC

Instead of forcing every method to handle SUPPORTED_OFFLINE for one use
lsm.Capabilities.VOLUME_RESIZE_ONLINE
# Without this capability, volume_resize() only handle
# STATUS_STOPPED volume which can be done by volume_offline()
LsmError
lsm.ErrorNumber.VOLUME_RESIZE_NO_SUPPORT_ONLINE
# Does not support online resize.

Sure, this will work. We can also add it for other methods which can't
be done while in use, like FS re-size too.
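
One way the capability-plus-error approach could behave, as a rough sketch. The names VOLUME_RESIZE_ONLINE, VOLUME_RESIZE_NO_SUPPORT_ONLINE, STATUS_STOPPED and LsmError come from the proposal; the dict-based volume, the set of capabilities, STATUS_OK and the function signature are assumptions, not the real plugin API.

```python
VOLUME_RESIZE_ONLINE = "VOLUME_RESIZE_ONLINE"
VOLUME_RESIZE_NO_SUPPORT_ONLINE = "VOLUME_RESIZE_NO_SUPPORT_ONLINE"
STATUS_OK = "OK"            # assumed name for an online volume
STATUS_STOPPED = "STOPPED"  # volume taken offline beforehand


class LsmError(Exception):
    """Simplified stand-in for lsm.LsmError carrying an error code."""

    def __init__(self, code, msg):
        super().__init__(msg)
        self.code = code


def volume_resize(volume, new_size_bytes, capabilities):
    """Resize a volume; refuse online resize unless the capability is set."""
    online = volume["status"] != STATUS_STOPPED
    if online and VOLUME_RESIZE_ONLINE not in capabilities:
        raise LsmError(
            VOLUME_RESIZE_NO_SUPPORT_ONLINE,
            "Online resize is not supported; take the volume offline first",
        )
    volume["size_bytes"] = new_size_bytes
    return volume


vol = {"id": "VOL_1", "status": STATUS_OK, "size_bytes": 1024}
try:
    volume_resize(vol, 2048, capabilities=set())
    error_code = None
except LsmError as err:
    error_code = err.code
```

Here the call fails with VOLUME_RESIZE_NO_SUPPORT_ONLINE and the volume keeps its size; the same call succeeds if the volume is stopped first or the capability is present.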

However, can we do this for the implementation: have just
unsupported/supported, but leave them as enumerated types? Keep the
get/set methods, which are used to retrieve enumerated types, and
introduce a new method, supported, which returns a boolean.

This way, if we need to introduce additional states, we can do so with
get/set, and supported will still work.
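
A minimal sketch of that shape, assuming a simplified stand-in class rather than the real lsm.Capabilities. The get/set/supported method names follow the suggestion above; the numeric values, the string capability keys and the default of UNSUPPORTED for unset capabilities are assumptions.

```python
class Capabilities:
    """Simplified stand-in: enumerated states plus a boolean helper."""
    UNSUPPORTED = 0
    SUPPORTED = 1

    def __init__(self):
        self._states = {}

    def set(self, capability, state):
        self._states[capability] = state

    def get(self, capability):
        # Enumerated view; unset capabilities are treated as UNSUPPORTED here.
        return self._states.get(capability, Capabilities.UNSUPPORTED)

    def supported(self, capability):
        # Boolean view; stays correct if more enumerated states are added
        # later, because only SUPPORTED maps to True.
        return self.get(capability) == Capabilities.SUPPORTED


cap = Capabilities()
cap.set("DISKS", Capabilities.SUPPORTED)
cap.set("VOLUME_RESIZE", Capabilities.UNSUPPORTED)
```

Callers that only need go/no-go use cap.supported("DISKS"); callers that care about the exact state keep using cap.get().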

For example you have (4) 2TB disks which are empty (ROLE_FREE). You
should be able to use the same disks to make a ~3.2TB RAID5 and a ~2TB
RAID10. What would the enumerated type be for the disks after the first
pool create? Can we accommodate this use case?

I only found EMC DMX/VMAX support using disk slice to create pool. Any
other vendor support this?

Well, you can do this with lvm and its internally supported RAID types.

I also believe XIO does/used to, and most likely Dell/Compellent supports
this, but the user didn't have to specify which specific parts of a disk
to use. They specify the RAID type and select which of the available
disks with free space to use, or at least that is what I remember anyway.

disk_type = lsm.Disk.DISK_TYPE_DISK_SLICE
1. DISK_SAS_SLICE and more constants.
2. Where did this slice come from. lsm.Disk.slice_owner?
3. More capability to guide user to use slice or whole disk for
pool creation.
4. Need more method to create disk slices.
B. If disk slice is supported, create a logical pool holding all disks,
and real pool is sub-pool of this one. The sub-pool can have their
own RAID, ThinP settings.
# This is actually how EMC DMX/VMAX representing their pool in
# their own SMI-S provider.
* lsm.Pool
id: MAIN_POOL_0
element_type: lsm.Pool.ELEMENT_TYPE_POOL
raid_type: JBOD
member_type: Disk
member_ids: [ DISK_1, DISK_2, DISK_3, DISK_4]
* lsm.Pool
id: SUB_POOL_1
raid_type: RAID 5
size: 3.2TiB
member_type: Pool
member_ids: [MAIN_POOL_0]
element_type: Volume or/and FS
* lsm.Pool
id: SUB_POOL_2
raid_type: RAID 10
size: 2TiB
member_type: Pool
member_ids: [MAIN_POOL_0]
element_type: Volume or/and FS
# About disk status: disk_role will always be "POOL_MEMBER", no
# matter whether a pool has been created or not. The user can only
# create sub-pools. (To support this, I have to update the wiki on
# lsm.Pool; currently a sub-pool only allows raid_type == NOT_APPLICABLE.)
I prefer idea B); I listed idea A) as a bad example.
The wiki has been updated with idea B), using the above sample and
comments.
How does that sound?

Option B. should work theoretically, that is unless the array itself
doesn't support sub pools. The method we have today to create sub pool
doesn't take a raid type, so we would need to add that.

Would it be possible to introduce an enumerated disk type that indicates
it is part of a pool, but is available to use for new pool creation?
Something like: lsm.Disk.DISK_ROLE_POOL_PARTIAL_MEMBER? Then when the
disk is fully consumed it would transition to
lsm.Disk.DISK_ROLE_POOL_MEMBER?

We can leave this ability out for now too and add it later by extending
the API, that's acceptable.

Yes, perhaps we could use FsDeltaRO, FsDeltaRW or are you suggesting we
change to single call with a copy type?

I prefer 'FsReplication' containing all copy types. It provides fewer
classes and methods.

I think FsReplicate with different copy types would be good as to not
confuse the user with VolumeReplication which is a relationship, not an
action.

So the giveback is really more than just making the data the same, it
also changes the access to the source and target volumes. Are these
directly modeled after SMI-S? I haven't explored the docs on these in
depth.

[IN, Required] uint16 Operation,
* Failover
Enable the read and write operations from the host to the target element.
This operation useful for situations when the source element is
unavailable.
* Failback
Switch the read/write activities from the host back to source
element. Update source element from target element with writes to
target during the failover period.
I chose 'giveback' as it seems more suitable.

I'm OK with either term. I think the important part is that we document
these methods well, so that it is clear what their intended purpose is.

Thanks!

Regards,
Tony

Gris Ge

2014-04-19 08:32:25 UTC

Post by Tony Asleson
Sure, this will work. We can also add it for other methods which can't
be done while in use, like FS re-size too.
However, can we do this for the implementation. Have just
unsupported/supported, but leave as enumerated types. Keep the methods
get/set which are used to retrieve enumerated types and introduce a new
method supported which returns a boolean?
This way if we need to introduce additional states we can with the
get/set and supported will work

Hi Tony,

I am very sorry for the late reply. I suffered a severe data loss on my
laptop and lost all of its data (including all my ongoing work on
MaskInfo). I have just reinstalled and set up my OS and restored some
data from my backup.

Sounds like your proposal is:
lsm.Capabilities.support() # return True or False
lsm.Capabilities.get() # return enumerated types.

With online/offline handled by a capability and LsmError, that leaves:
lsm.Capabilities.UNSUPPORTED
lsm.Capabilities.NOT_IMPLEMENTED
lsm.Capabilities.SUPPORTED
lsm.Capabilities.UNKNOWN

I treat lsm.Capabilities as the beacon for 'go' or 'no go'.
Removing 'lsm.Capabilities.UNKNOWN' would get rid of this blurry beacon
state.

That leaves 'lsm.Capabilities.NOT_IMPLEMENTED' as the only reason to keep
lsm.Capabilities.get().

I assume 'NOT_IMPLEMENTED' means the storage array supports the feature,
but the plugin or its control SDK/API has not implemented it. We would
use it to indicate that users may get this capability some day in the
future, with no guarantee.

Is there any other use case I missed for keeping the enumerated type?

Post by Tony Asleson
Option B. should work theoretically, that is unless the array itself
doesn't support sub pools. The method we have today to create sub pool
doesn't take a raid type, so we would need to add that.

I have already updated the wiki page to support RAID in sub-pools.

Post by Tony Asleson
Would it be possible to introduce an enumerated disk type that indicates
it is part of a pool, but is available to use for new pool creation?
Something like: lsm.Disk.DISK_ROLE_POOL_PARTIAL_MEMBER? Then when the
disk is fully consumed it would transition to
lsm.Disk.DISK_ROLE_POOL_MEMBER?
We can leave this ability out for now too and add it later by extending
the API, that's acceptable.

'PARTIAL_MEMBER' looks good to me for now, but I'd like to investigate
the plugin implementation before we nail it down.
Let's postpone it until after the API release. It's an acceptable
addition.

Post by Tony Asleson

Yes, perhaps we could use FsDeltaRO, FsDeltaRW or are you suggesting we
change to single call with a copy type?

I prefer 'FsReplication' containing all copy types. It provides fewer
classes and methods.

I think FsReplicate with different copy types would be good as to not
confuse the user with VolumeReplication which is a relationship, not an
action.

Do you mean that 'FsReplicate' creates a new target file system?

Post by Tony Asleson
I'm OK with either term. I think the important part is that we document
these methods well, so that it is clear what their intended purpose is.

Will do.

Post by Tony Asleson
Thanks!
Regards,
Tony

--
Gris Ge