If the site replication is enabled and the code tries to extract jwt
claims while the site replication service account credentials are still
not loaded yet, the code will enter an infinite loop, causing in a
high CPU usage.
Another possibility of the infinite loop is having some service accounts
created by an old deployment version where the service account JWT was
signed by the root credentials, but not anymore.
This commit will remove the possibility of the infinite loop in the code
and add root credential fallback to extract claims from old service
accounts.
Hot load a policy document when during account authorization evaluation
to avoid returning 403 during server startup, when not all policies are
already loaded.
Add this support for group policies as well.
the reason for this is to avoid STS mappings to be
purged without a successful load of other policies,
and all the credentials only loaded successfully
are properly handled.
This also avoids unnecessary cache store which was
implemented earlier for optimization.
If used, 'opts.Marker` will cause many missed entries since results are returned
unsorted, and pools are serialized.
Switch to fully concurrent listing and merging across pools to return sorted entries.
Create new code paths for multiple subsystems in the code. This will
make maintaing this easier later.
Also introduce bugLogIf() for errors that should not happen in the first
place.
This fixes a bug where STS Accounts map accumulates accounts in memory
and never removes expired accounts and the STS Policy mappings were not
being refreshed.
The STS purge routine now runs with every IAM credentials load instead
of every 4th time.
The listing of IAM files is now cached on every IAM load operation to
prevent re-listing for STS accounts purging/reload.
Additionally this change makes each server pick a time for IAM loading
that is randomly distributed from a 10 minute interval - this is to
prevent server from thundering while performing the IAM load.
On average, IAM loading will happen between every 5-15min after the
previous IAM load operation completes.
Fix races in IAM cache
Fixes#19344
On the top level we only grab a read lock, but we write to the cache if we manage to fetch it.
a03dac41eb/cmd/iam-store.go (L446) is also flipped to what it should be AFAICT.
Change the internal cache structure to a concurrency safe implementation.
Bonus: Also switch grid implementation.
This allows batch replication to basically do not
attempt to copy objects that do not have read quorum.
This PR also allows walk() to provide custom
values for quorum under batch replication, and
key rotation.
To ensure that policy mappings are current for service accounts
belonging to (non-derived) STS accounts (like an LDAP user's service
account) we periodically reload such mappings.
This is primarily to handle a case where a policy mapping update
notification is missed by a minio node. Such a node would continue to
have the stale mapping in memory because STS creds/mappings were never
periodically scanned from storage.
This helps reduce disk operations as these periodic routines would not
run concurrently any more.
Also add expired STS purging periodic operation: Since we do not scan
the on-disk STS credentials (and instead only load them on-demand) a
separate routine is needed to purge expired credentials from storage.
Currently this runs about a quarter as often as IAM refresh.
Also fix a bug where with etcd, STS accounts could get loaded into the
iamUsersMap instead of the iamSTSAccountsMap.
In situations with large number of STS credentials on disk, IAM load
time is high. To mitigate this, STS accounts will now be loaded into
memory only on demand - i.e. when the credential is used.
In each IAM cache (re)load we skip loading STS credentials and STS
policy mappings into memory. Since STS accounts only expire and cannot
be deleted, there is no risk of invalid credentials being reused,
because credential validity is checked when it is used.
Sometimes IAM fails to load certain items, which could be a user,
a service account or a policy but with not enough information for
us to debug.
This commit will create a more descriptive error to make it easier to
debug in such situations.
```
commit 7bdaf9bc50
Author: Aditya Manthramurthy <donatello@users.noreply.github.com>
Date: Wed Jul 24 17:34:23 2019 -0700
Update on-disk storage format for users system (#7949)
```
Bonus: fixes a bug when etcd keys were being re-encrypted.
listConfigItems creates a goroutine but sometimes callers will
exit without properly asking listAllIAMConfigItems() to stop sending
results, hence a goroutine leak.
Create a new context and cancel it for each listAllIAMConfigItems
call.
- This introduces a new admin API with a query parameter (v=2) to return a
response with the timestamps
- Older API still works for compatibility/smooth transition in console
This reverts commit 091a7ae359.
- Ensure all actions accessing storage lock properly.
- Behavior change: policies can be deleted only when they
are not associated with any active credentials.
Also adds fix for accidental canned policy removal that was present in the
reverted version of the change.
- Ensure all actions accessing storage lock properly.
- Behavior change: policies can be deleted only when they
are not associated with any active credentials.
IAMSys is a higher-level object, that should not be called by the lower-level
storage API interface for IAM. This is to prepare for further improvements in
IAM code.
Additional support for vendor-specific admin API
integrations for OpenID, to ensure validity of
credentials on MinIO.
Every 5minutes check for validity of credentials
on MinIO with vendor specific IDP.
This is to ensure that there are no projects
that try to import `minio/minio/pkg` into
their own repo. Any such common packages should
go to `https://github.com/minio/pkg`