Issue
A pandas DataFrame has an info() method that shows its schema.
df = pd.read_csv(titanic_file)
df.info()
---
RangeIndex: 627 entries, 0 to 626
Data columns (total 10 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   survived            627 non-null    int64
 1   sex                 627 non-null    object
 2   age                 627 non-null    float64
 3   n_siblings_spouses  627 non-null    int64
 4   parch               627 non-null    int64
 5   fare                627 non-null    float64
 6   class               627 non-null    object
 7   deck                627 non-null    object
 8   embark_town         627 non-null    object
 9   alone               627 non-null    object
dtypes: float64(2), int64(3), object(5)
memory usage: 49.1+ KB
What is the equivalent for a TensorFlow dataset, other than checking the dtype column by column?
import tensorflow as tf

titanic = tf.data.experimental.make_csv_dataset(
    titanic_file,
    label_name="survived",
    batch_size=1,   # To compare with the head of the CSV
    shuffle=False,  # To compare with the head of the CSV
    header=True,
)
for row in titanic.take(1):  # Take the first batch
    features = row[0]  # Dictionary mapping column name to tensor
    label = row[1]
    for feature, value in features.items():
        print(f"{feature:20s}: {value.dtype}")
    print(f"label/survived      : {label.dtype}")
---
sex                 : <dtype: 'string'>
age                 : <dtype: 'float32'>
n_siblings_spouses  : <dtype: 'int32'>
parch               : <dtype: 'int32'>
fare                : <dtype: 'float32'>
class               : <dtype: 'string'>
deck                : <dtype: 'string'>
embark_town         : <dtype: 'string'>
alone               : <dtype: 'string'>
label/survived      : <dtype: 'int32'>
Solution
The closest thing that comes to mind is tf.data.experimental.get_structure, which returns the type specification of a dataset's elements:
import tensorflow as tf
import tensorflow_datasets as tfds
# Construct a tf.data.Dataset
ds = tfds.load('mnist', split='train', shuffle_files=True)
tf.data.experimental.get_structure(ds)
Out:
{'image': TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None),
'label': TensorSpec(shape=(), dtype=tf.int64, name=None)}
And for the Titanic dataset (column names may differ slightly depending on the source):
(OrderedDict([('PassengerId', TensorSpec(shape=(1,), dtype=tf.int32, name=None)),
              ('Pclass', TensorSpec(shape=(1,), dtype=tf.int32, name=None)),
              ('Name', TensorSpec(shape=(1,), dtype=tf.string, name=None)),
              ('Sex', TensorSpec(shape=(1,), dtype=tf.string, name=None)),
              ('Age', TensorSpec(shape=(1,), dtype=tf.float32, name=None)),
              ('SibSp', TensorSpec(shape=(1,), dtype=tf.int32, name=None)),
              ('Parch', TensorSpec(shape=(1,), dtype=tf.int32, name=None)),
              ('Ticket', TensorSpec(shape=(1,), dtype=tf.string, name=None)),
              ('Fare', TensorSpec(shape=(1,), dtype=tf.float32, name=None)),
              ('Cabin', TensorSpec(shape=(1,), dtype=tf.string, name=None)),
              ('Embarked', TensorSpec(shape=(1,), dtype=tf.string, name=None))]),
 TensorSpec(shape=(1,), dtype=tf.int32, name=None))
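If you want something that reads more like df.info(), one option is to flatten that nested structure into one row per column. The snippet below is only a sketch: describe is a hypothetical helper (not part of TensorFlow), the small in-memory dataset stands in for the Titanic CSV with made-up values, and the "0/..." and "1" names are just the tuple positions of features and label. It reads ds.element_spec, which as far as I know carries the same information that get_structure returns, and it assumes every leaf exposes dtype and shape the way a TensorSpec does.

```python
import tensorflow as tf

# Tiny in-memory stand-in for the Titanic CSV; the values are made up.
features = {
    "sex": tf.constant(["male", "female"]),
    "age": tf.constant([22.0, 38.0], dtype=tf.float32),
    "parch": tf.constant([0, 0], dtype=tf.int32),
}
labels = tf.constant([0, 1], dtype=tf.int32)
ds = tf.data.Dataset.from_tensor_slices((features, labels)).batch(
    1, drop_remainder=True
)


def describe(spec, path=()):
    """Flatten a nested element_spec into (name, dtype, shape) rows."""
    if isinstance(spec, dict):
        rows = []
        for key, sub in spec.items():
            rows.extend(describe(sub, path + (str(key),)))
        return rows
    if isinstance(spec, (tuple, list)):
        rows = []
        for index, sub in enumerate(spec):
            rows.extend(describe(sub, path + (str(index),)))
        return rows
    # Leaf: assumed to be a TensorSpec-like object with dtype and shape.
    return [("/".join(path), spec.dtype.name, tuple(spec.shape.as_list()))]


# One line per column: name, dtype, static shape.
for name, dtype, shape in describe(ds.element_spec):
    print(f"{name:20s} {dtype:10s} {shape}")
```

Unlike df.info(), this only reports the static schema (names, dtypes, shapes). It does not count rows or non-null values, since that would require iterating over the whole dataset.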
Answered By - Proko